
Re: Deep contexts WAS: Re: [ba-ohs-talk] Re: A modest proposal

> What are the scopes of S1 and S2's baseNames though? Well, they appear
> to have the same baseName in the same scope so maybe they
> should be merged too? And so on.
> Is there an end to the regress?

If the merge is not automagic, then
you do the merge manually and submit the result to the group.
If accepted, it becomes the new mapping.
If the merge is automagic, then
you untangle the merge and submit the result to the group.
If accepted, it becomes the new mapping.

Best Regards,
Joe

----- Original Message -----
From: "Peter Jones" <ppj@concept67.fsnet.co.uk>
To: <ba-ohs-talk@bootstrap.org>
Sent: March 23, 2002 5:29 AM
Subject: Deep contexts WAS: Re: [ba-ohs-talk] Re: A modest proposal

> Hi Murray,
> OK, let me try and explain the idea of deep contexts some more first,
> then I hope to be able to present the argument as to why
> the division of the example topic you've provided into 4 topics is not
> really making any great change for the user, although it
> does make a big change behind the scenes for processing and system
> significance.
> {Warning: Very long post riding roughshod over sensitive terrain in
> places.}
> Deep contexts:
> In XTM 1.0 and ISO 13250 there was a 'catch' with scoping topics.
> Imagine you have TM1 that contains a topic T1, 'Cecil the cat', in scope
> S1, 'house 51', and a TM2 that contains a topic T2, 'Cecil the cat', in
> scope S2, 'house 51'.
> TM1 and TM2 are separate documents ostensibly about different subjects.
> I load TM1 into my TM processor. I surf its contents and via one of its
> occurrences I am accidentally led to click on a link that imports TM2
> into the processor.
> T1 and T2 appear to be about the same thing, and they appear to be in
> the same scope. So let's try merging them.
> Topic maps say that the same baseName in the same scope indicates a
> topic that should be merged.
> So it looks like T1 and T2 should be merged. But how do we determine
> whether S1 and S2 are in fact the same scope?
> Well, they appear to have the same baseName in the same scope, so maybe
> they should be merged too.
> What are the scopes of S1 and S2's baseNames though? Well, they appear
> to have the same baseName in the same scope so maybe they
> should be merged too? And so on.
> Is there an end to the regress?
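A minimal Python sketch of the regress described above, offered only as an illustration. The `Topic` class, the `names_match` function, the `max_depth` cap and the 'Acacia Avenue' and 'Baker Street' scopes are all hypothetical and appear nowhere in XTM 1.0 or ISO 13250. The sketch also assumes a single scoping topic per name, whereas a real scope is a set of topics.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Topic:
    base_name: str
    # The scoping topic is itself a topic; None stands for the unconstrained scope.
    scope: Optional["Topic"] = None


class RegressError(Exception):
    """Raised when the scope chain is deeper than we are willing to follow."""


def names_match(a: Topic, b: Topic, max_depth: int = 10) -> bool:
    """Name-based merge test: same baseName in the same scope.

    Deciding whether two scopes are 'the same' applies the same test to the
    scoping topics, which is where the regress comes from.
    """
    if a.base_name != b.base_name:
        return False
    if a.scope is None and b.scope is None:
        return True  # both names are in the unconstrained scope: regress ends
    if a.scope is None or b.scope is None:
        return False
    if max_depth == 0:
        raise RegressError("scope chain too deep to decide")
    return names_match(a.scope, b.scope, max_depth - 1)


# TM1 and TM2 each carry their own copy of the scoping topics.
t1 = Topic("Cecil the cat", Topic("house 51", Topic("Acacia Avenue")))
t2 = Topic("Cecil the cat", Topic("house 51", Topic("Acacia Avenue")))
t3 = Topic("Cecil the cat", Topic("house 51", Topic("Baker Street")))

print(names_match(t1, t2))  # scopes agree all the way up
print(names_match(t1, t3))  # the two 'house 51' scopes differ one level up
```

The recursion stops only because the chain eventually reaches a name in the unconstrained scope or hits the arbitrary depth cap. Without one of those, nothing in the name-based test itself ends the regress.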
> There are three methods of ending it, which can be combined.
> 1-  Restrict the application TM set so that you know that there is an
> upper bound in advance.
> 2-  Don't look too closely at the scoped properties of scoping topics
> (or put otherwise, things aren't so bad if every topic has a baseName in
> the unconstrained scope or has a Subject that is a PSI - which is the
> same thing). We are "strongly encouraged to use common Published Subject
> Indicators" in TMPM4.
> On the subject of PSIs TMPM4 says, "Such organizations [that serve
> communities of interest] should commit themselves to preserving the
> longterm validity of the published addresses of such identity points, in
> order to protect the value and mergeability of the topic maps that use
> them."
> 3-  Ignore the Name merger rule and only go for merger when a topic
> Subject is the same resource address in both cases.
> Although cf. section 7 of TMPM4 at www.topicmaps.net
> "The Subject-based Merging Rule requires conforming topic map processing
> systems to merge t-nodes that are known to such systems to have the same
> subject, *on the basis of whatever information is available to them*. In
> addition, the Subject-based Merging Rule requires conforming topic map
> processing systems to conclude, on the basis of *certain conditions*,
> that two t-nodes have the same subject, and that they therefore must be
> merged into a single t-node." [my added emphasis]
> What are the conditions TMPM4 proposes? To quote at length:
> "Whenever two t-nodes both have identity points that are subject
> constituting resources, they must be merged if and only if the two
> subject constituting resources are known to the processing system to be
> one and the same resource, regardless of how that resource may have been
> differently addressed. In other words, merging is required if and only
> if the two addresses are known to the processing system to be
> equivalent.
> "All t-nodes have at least one subject indicator resource. (If nothing
> else, a t-node must at least have the syntactic construct that demanded
> its existence as one of its subject indicators.) Two t-nodes that do not
> have subject constituting resources shall be merged if and only if:
> either:
> "one of the two t-nodes has at least one subject indicator resource that
> is known to the processing system to be the same resource that serves as
> one of the subject indicators of the other t-node,
> [PPJ: and it seems to me that if the resource is within a topic map the
> TM1 vs. TM2 regress continues here.]
> or:
> "the two subject indicator resources indicating the subject are known
> (on account of machine intelligence or human intervention) to the
> processing system to describe the same subject.
> [PPJ: which some might say might be passing the buck a tad too much or
> encouraging an approach that has great risks for the integrity of the
> information. But perhaps that would be asking too much of XTM 1.0 here.]
> "For purposes of the Subject-based Merging Rule, it is irrelevant
> whether two subject indicator resources, or two subject constituting
> resources, contain the same data or are the same string. A simple string
> comparison of the two subject indicator resources is not, in the general
> case, a reliable indication of whether or not the same subject is being
> described. For example, different products in different sales catalogs
> may coincidentally have the same catalog number, and a comparison of the
> two catalog numbers does not indicate that they are the same product.
> Therefore, the Subject-based Merging Rule is not based on comparing the
> data content of the resources that serve as identity points. Merging
> must occur if and only if:
> "either both subject identity points are subject indicators, or both
> subject identity points are subject constituters (i.e., they can't be
> mixed), and
> "they are one and the same resource, meaning that they exist in exact
> same addressable context, even though there may be multiple different
> equivalent addressing expressions that can arrive at that same resource
> in that same addressable context."
> [PPJ: But note that this approach does not cure issues with polysemy
> where two topics with the same resource indicator and baseName
> nonetheless have different types. And up the scoping regress we go
> again.]
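A rough Python sketch of the Subject-based Merging Rule as quoted above, under stated assumptions. `TNode`, `must_merge`, `resolve`, `same_subject` and the `example.org` addresses are hypothetical names for illustration, not part of TMPM4 or any topic map engine. "Known to the processing system to be equivalent" is modelled here as a lookup table of address aliases, and treating a mixed pair (one constituter, one indicator-only) as never merging is one reading of the "can't be mixed" clause.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional


@dataclass(frozen=True)
class TNode:
    # Address of a subject constituting resource, if the t-node has one.
    constituter: Optional[str] = None
    # Addresses of the t-node's subject indicator resources.
    indicators: FrozenSet[str] = frozenset()


# Hypothetical table of addresses the system knows to reach the same resource.
ALIASES = {
    "http://example.org/zoo/elephant.html": "res:elephant-page",
    "http://mirror.example.org/elephant": "res:elephant-page",
}


def resolve(address: str) -> str:
    """Map an address to the resource it reaches (identity if unknown)."""
    return ALIASES.get(address, address)


def must_merge(
    a: TNode,
    b: TNode,
    same_subject: Callable[[str, str], bool] = lambda x, y: False,
) -> bool:
    """Decide merging from resource identity, never from resource content."""
    if a.constituter is not None and b.constituter is not None:
        return resolve(a.constituter) == resolve(b.constituter)
    if a.constituter is not None or b.constituter is not None:
        return False  # assumed reading: identity point kinds cannot be mixed
    ra = {resolve(i) for i in a.indicators}
    rb = {resolve(i) for i in b.indicators}
    if ra & rb:
        return True  # a shared subject indicator resource
    # Otherwise defer to 'machine intelligence or human intervention'.
    return any(same_subject(x, y) for x in ra for y in rb)


n1 = TNode(indicators=frozenset({"http://example.org/zoo/elephant.html"}))
n2 = TNode(indicators=frozenset({"http://mirror.example.org/elephant"}))
n3 = TNode(constituter="http://example.org/zoo/elephant.html")
n4 = TNode(indicators=frozenset({"http://example.org/zoo/giraffe.html"}))

print(must_merge(n1, n2))  # different addresses, same resource
print(must_merge(n1, n3))  # indicator versus constituter
print(must_merge(n1, n4))  # no shared resource, no outside judgement
```

Note that the `same_subject` hook is exactly where the rule hands the decision to something outside the topic map, and that nothing here looks at topic types, so the polysemy problem is untouched.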
> All three of these approaches, imho, arise from a not altogether
> satisfactory limiting of the contextual information in terms of the
> depth of investigation you can make into the properties of a scoping
> topic.
> If in XTM 1.0 you could investigate the properties of such topics,
> instead of getting answers as you investigate more deeply into the
> scoping hierarchy you would get more questions - a negative regress.
> (Hold that thought for a moment.)
> Now, my objections (but see comment below these):
> Objection to 1: is that it prevents the TM system being opened out into
> the web as an open collaborative enterprise.
> Objection to 2: it's buck passing on to another less than satisfactory
> system, because, let's face it, agreeing on the words to use for something
> is a perennial human problem.
> Objection to 3: it's buck passing with less accuracy than in 2. You
> can't really nail _that much_ from an address on the web.
> If all I were speaking of were ISO 13250 then I would admit that I am
> being far too harsh. The aims of ISO 13250 were to index structured
> collections, and merge indexes. Indexing is not knowledge representation
> in the strict sense of both terms.
> So ISO 13250 arguably does what it is supposed to.
> However, it appeared to me in the XTM 1.0 meetings that for some the
> agenda went further, and that attempts were made to extend the game to
> KR.
> And then things became very strained and confusing and messy.
> But what I still believe, after much thought, is that XTM could be
> converted to do full-blown KR, and that it might prove to be a solid
> lingua franca for such applications, and do so globally, and openly on
> the web.
> One would need to remove negative regressions and make topic property
> comparison stricter.
> Then as you investigate up the scoping hierarchy you would get answers,
> not questions, and the answers would run as far up the scheme of things
> as you needed to look to be sure that you were putting something in the
> right place.
> You would have what I am calling 'deep contexts'.
> Now, to come back to the 'splitting into four topics' issue. Given the
> above, I hope you can now see why there is at least an argument (and
> only an argument as yet!) for removing internal scoping from topics and
> restricting baseName cardinality, etc.
> Then, do you agree that if I split your topic into four, each with its
> own scope, then the only information that is lost is that they all
> relate to the same binding point?
> My answer to that is: in indexing maybe the binding point is a useful
> shorthand, in KR perhaps it creates too much implicit information that
> would be better held in explicit scoped associations.
> I hope that makes my approach somewhat clearer.
> The discussion and debate, as ever, much appreciated.
> --
> Peter
> ----- Original Message -----
> From: "Murray Altheim" <m.altheim@open.ac.uk>
> To: <ba-ohs-talk@bootstrap.org>
> Sent: Friday, March 22, 2002 5:23 PM
> Subject: Re: [ba-ohs-talk] Re: A modest proposal
> > Peter Jones wrote:
> >
> > > Yes, you would.
> > > But then presumably if you were Swahili and wanted to look up giraffes
> > > you would be searching in Swahili and not in English.
> > > Swahili would be a scope.
> >
> >
> > Yes, exactly.
> >
> >
> > > Then it makes no difference to what the user would see, it only enables
> > > deep contexts within the system if that is desirable.
> >
> >
> > But this is how topic maps work. Your use of "deep contexts" I
> > don't quite understand.
> >
> > Now, for reasons I'll not elaborate, let's try an example of this
> > with "elephant" in Zulu instead of a "giraffe" in Swahili (okay,
> > I could locate a speaker of Zulu but not Swahili on short notice,
> > and "giraffe" isn't translated into Japanese or Korean AFAIK). The
> > topic element looks like this:
> >
> >    <topic id="ele34">
> >      <subjectIdentity>
> >        <subjectIndicatorRef
> >          xlink:href="http://www.altheim.com/zoo/elephant.html"/>
> >      </subjectIdentity>
> >      <baseName><!-- Zulu -->
> >        <scope><topicRef xlink:href="../language.xtm#zu"/></scope>
> >        <baseNameString>ndofu</baseNameString>
> >      </baseName>
> >      <baseName><!-- Korean -->
> >        <scope><topicRef xlink:href="../language.xtm#ko"/></scope>
> >        <baseNameString>코끼리</baseNameString>
> >      </baseName>
> >      <baseName><!-- Japanese -->
> >        <scope><topicRef xlink:href="../language.xtm#ja"/></scope>
> >        <baseNameString>象</baseNameString>
> >      </baseName>
> >      <baseName><!-- English -->
> >        <scope><topicRef xlink:href="../language.xtm#en"/></scope>
> >        <baseNameString>elephant</baseNameString>
> >      </baseName>
> >      ...
> >    </topic>
> >
> > I've included "elephant" basenames in Zulu, Korean, Japanese
> > and English. This is completely typical XTM, as shown in the
> > examples in the XTM spec. I'm not sure why you'd want to do
> > anything different, such as divide it into four topics. If
> > we assume that
> >
> >    "http://www.altheim.com/zoo/elephant.html"
> >
> > is an acceptable indicator of the subject "elephant", then all
> > four of the divided topics would appropriately contain a
> > <subjectIdentity> pointing to that URL, and would be merged
> > back into what you see above the first time it was processed
> > through a compliant topic map engine. People in four languages
> > could locate the topic (whose ID is "ele34") by searching in
> > the scope of any of the four languages provided.
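To make the "merged back" step concrete, here is a small Python sketch, assuming a toy in-memory model rather than a real topic map engine. The dictionary layout, the split topic ids (`ele-zu` and so on) and the `merge_by_indicator` and `find` helpers are hypothetical; the URL, language codes and names are taken from the XTM example above.

```python
from collections import defaultdict

URL = "http://www.altheim.com/zoo/elephant.html"

# The single topic divided into four, one per language scope.
# Each still points at the same subject indicator.
split_topics = [
    {"id": "ele-zu", "indicator": URL, "names": {"zu": "ndofu"}},
    {"id": "ele-ko", "indicator": URL, "names": {"ko": "코끼리"}},
    {"id": "ele-ja", "indicator": URL, "names": {"ja": "象"}},
    {"id": "ele-en", "indicator": URL, "names": {"en": "elephant"}},
]


def merge_by_indicator(topics):
    """Collapse topics that share a subject indicator into one entry.

    Returns a dict mapping each indicator to its scoped names.
    """
    merged = defaultdict(dict)
    for topic in topics:
        merged[topic["indicator"]].update(topic["names"])
    return dict(merged)


def find(merged, scope, name):
    """Return the indicators of topics whose name in `scope` is `name`."""
    return [ind for ind, names in merged.items() if names.get(scope) == name]


merged = merge_by_indicator(split_topics)
print(len(merged))               # one topic after merging
print(find(merged, "ja", "象"))  # found by its Japanese name
```

After the merge there is a single entry carrying all four scoped names, so a search in any of the four language scopes lands on the same topic. The split is invisible to the user once the shared subject indicator has done its work.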
> >
> > Murray
> >
> > ......................................................................
> > Murray Altheim                         <mailto:m.altheim @ open.ac.uk>
> > Knowledge Media Institute
> > The Open University, Milton Keynes, Bucks, MK7 6AA, UK
> >
> >       In the evening
> >       The rice leaves in the garden
> >       Rustle in the autumn wind
> >       That blows through my reed hut.  -- Minamoto no Tsunenobu
> >
> >