[ba-unrev-talk] Alternative categorization scheme
"Dennis E. Hamilton" wrote: (01)
> Chris, .... I think you are spot on. The approach to formal ontologies,
> taxonomies, and related classification/nomenclature schemes seems to involve
> straitjacketing into fixed conceptual frameworks. I think it shows up
> dramatically in cross-cultural as well as cross-language matters. I do not
> know how the semantic web will reconcile this. We get to find out in one of
> the grandest information systems experiments going.
>
> At the same time, based on my experience in computation theory, I am aware
> of schemes that do not require this kind of commitment. And it is not clear
> how that can help. There is something more powerful, but it may not make
> sense or be practicable to expect to exercise it at the textual analysis
> level. Especially if markup is required. (02)
Rich Persaud wrote: (03)
> See also current search engine work from NEC Research:
>
> http://www.nature.com/nsu/020304/020304-8.html
> http://webselforganization.com/example.html
> http://webselforganization.com (04)
Synchronicity! I've just been reading Nexus -- a fascinating non-mathematical
discussion of the properties of networks. The first paper shows the NEC
researchers are taking advantage of the cardinal property of networks: the
fact that they group themselves into multiple dense clusters. (05)
That inspection algorithm could be the key to finding the right context for
a search. For example, if I search on "shoe installation", the search might
yield multiple clusters, centered around:
* automobile brake shoes
* motorcycle brake shoes
* bicycle brake shoes
* "shoes" in mechanical devices
* shoe stores
* hanging large wooden signs (06)
Suddenly, "more like this" takes on an entirely new meaning. At Google's
site, "more like this" means "more pages from the same server". But if a
single representative page for each of the above groups were delivered
(say, the most frequently linked page in each group), then "more like this"
would mean, "expand this subgroup and show me all the pages in it". (07)
What's fascinating about that is that network properties can be exploited
to identify categories, *without* having to label them or add metadata to
the web. (08)
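
To make the idea concrete, here is a rough sketch in Python. It is not the NEC
algorithm: it uses plain connected components as a crude stand-in for
dense-cluster detection, and the page names and links in LINKS are made up for
illustration. It groups pages using link structure alone, picks the most
frequently linked page in each group as the representative, and keeps the full
group available for "more like this".

```python
from collections import defaultdict


def find_clusters(links):
    """Group pages into clusters, treating each link as an undirected edge.

    Assumption: connected components stand in for real dense-cluster
    detection, which would also split loosely connected groups.
    """
    neighbors = defaultdict(set)
    for src, dst in links:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    seen = set()
    clusters = []
    for page in sorted(neighbors):
        if page in seen:
            continue
        # Depth-first walk over everything reachable from this page.
        stack = [page]
        seen.add(page)
        component = set()
        while stack:
            current = stack.pop()
            component.add(current)
            for nxt in neighbors[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        clusters.append(component)
    return clusters


def representative(cluster, links):
    """Return the page in the cluster with the most inbound links."""
    inbound = defaultdict(int)
    for _src, dst in links:
        if dst in cluster:
            inbound[dst] += 1
    # Sorting first makes ties resolve alphabetically.
    return max(sorted(cluster), key=lambda page: inbound[page])


def search_summary(links):
    """Map one representative page per cluster to that cluster's pages."""
    return {representative(c, links): sorted(c) for c in find_clusters(links)}


# Hypothetical link data: (linking page, linked page). No labels or metadata.
LINKS = [
    ("car-forum", "car-brake-shoes"),
    ("car-diy", "car-brake-shoes"),
    ("car-diy", "car-forum"),
    ("bike-blog", "bike-brake-shoes"),
    ("bike-shop", "bike-brake-shoes"),
]

summary = search_summary(LINKS)
for rep, members in summary.items():
    # "More like this" on rep would expand to members.
    print(rep, "->", members)
```

The result page would list only the keys of summary, one per cluster, and
"more like this" on any of them would expand to the list stored under that
key. A real implementation would need a cluster detector that can separate
groups that happen to share a few links, which connected components cannot do.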