Re: [unrev-II] OHS/DKR Intelligent Software Agents, Content, Sources, and Bootstrap Understanding & Recognition

From: John J. Deneen (
Date: Sun May 21 2000 - 01:06:20 PDT



    He began wondering how he might use technology to solve society's
    problems. "The complexity that mankind is facing is getting greater and
    greater," he said to himself, "so our problems are getting more complex,
    and the time we have to deal with them is getting shorter and more
    urgent. Is there anything I can do to contribute to mankind's ability to
    cope with complexity and urgency?" The more he thought about it, the
    more it moved him. "That," he concluded, "would be a terrific goal."

    In 1998, Edward Wilson wrote (Consilience - The Unity of Knowledge. New
    York: Knopf. 332 pp.) that:

     "A great deal of serious thinking is needed to navigate the decades
    immediately ahead … . only unified learning, universally shared, makes
    accurate foresight and wise choice possible. … we are learning the
    fundamental principle that ethics is everything."

    The strategy for pursuit of this vision will be information based,
    knowledge guided, stakeholder driven, and human centered. It will
    develop four dimensions of bootstrapping:

       * discovery (through research),
       * integration (through interdisciplinary collaboration),
       * dissemination (through education that becomes life-long learning),
       * application (through cooperation among academia, business and
         industry, the several levels of government, and nongovernmental
         organizations).

    Wrestling with the complex problems of providing a sustainable world for
    future generations is the most crucial challenge society and governments
    face today and into the new millennium. Success in developing effective
    strategies to confront this formidable challenge will require novel
    forms of cooperation and collaboration between and among institutions
    across all jurisdictional boundaries. While processes for moving toward
    this desired future condition are taking place, their evolution is being
    accelerated by the rapidly increasing knowledge in science, engineering
    and technology—and by the emergence of increasingly practical ways for
    disseminating and using data and information for the creation of
    knowledge.

     These trends strongly indicate that it is timely for nations of the
    world to galvanize the scientific and technological capabilities of
    their institutions into concerted action for advancing research and
    monitoring for managing their economic, social and ecological systems in
    a sustainable manner across continental scales. Working in partnership
    to carry out this common task will further galvanize our institutions to
    successfully confront the complex sustainability challenges of the new
    millennium.


    Intelligent service application software agents must provide tailored,
    human-centered data acquisition and processing, data fusion, and
    information generation and dissemination to users. These agents act to
    deliver processed, synoptic information rather than volumes of data and
    images. A function such agents serve is rapid search and discovery of
    geographic knowledge. The basis for such search is often geographic
    location. The object is to retrieve all data and information concerning
    a place. These data and information must be retrieved and organized into
    a form from which the user can effectively and efficiently extract
    required information. The service application software agents
    collaborate with other software agents to achieve general goals set by
    users, and based on user profiling, generate pertinent situation changes
    that may be of interest to the user. The agents support automatic,
    dynamic, adaptive allocation of transport and processing resources, and
    replicate as necessary for efficiency and to ensure continuity of
    services provided to the user.
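    The location-keyed retrieval described above can be sketched in a few
    lines. This is a minimal, hypothetical illustration only: the record
    layout, the haversine radius test, and all names are assumptions, not
    part of the OHS/DKR design.

```python
# Hypothetical sketch: a service agent retrieves all records concerning
# a place, using geographic location as the search key.
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def retrieve_by_place(records, lat, lon, radius_km):
    """Return every record whose location lies within radius_km of the query point."""
    return [rec for rec in records
            if haversine_km(rec["lat"], rec["lon"], lat, lon) <= radius_km]

records = [
    {"id": "map-17", "lat": 37.77, "lon": -122.42},  # near the query point
    {"id": "img-03", "lat": 40.71, "lon": -74.01},   # thousands of km away
]
nearby = retrieve_by_place(records, 37.78, -122.41, 25.0)
```

    A real agent would of course draw on heterogeneous databases rather
    than an in-memory list, but the retrieve-and-organize-by-place pattern
    is the same.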

    Intelligent application software agents must provide an array of
    functions appropriate to the user's mission and situation, and exchange
    information and status with other application software agents to provide
    integrated yet distributed execution of requested user services. These
    agents automatically select and perform their functions depending on
    specific user requirements and profiled user interest areas. The agents
    provide discovery and integration of text, tabular and geospatial data
    from multiple, heterogeneous databases, broker between other agents for
    sharing of information, and negotiate with service agents to establish
    appropriate network and resource allocations to achieve their goals.
    These agents are adaptive, in that they profile user needs for
    information such as measurements, targets, maps, changes in areas, and
    models against direct user input, past user requirements, and an
    understanding of user mission, status, and intentions.
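    The negotiation with service agents over resource allocations might
    look, in the simplest case, like a bid-and-accept exchange. The
    protocol below is invented purely for illustration; agent names,
    capacities, and costs are assumptions.

```python
# Illustrative sketch of agent-to-agent negotiation: an application agent
# solicits resource offers from service agents and accepts the cheapest
# offer that meets its bandwidth goal.

class ServiceAgent:
    def __init__(self, name, capacity_mbps, cost):
        self.name = name
        self.capacity_mbps = capacity_mbps
        self.cost = cost

    def bid(self, needed_mbps):
        """Offer resources if capacity suffices; otherwise decline."""
        if self.capacity_mbps >= needed_mbps:
            return {"agent": self.name, "cost": self.cost}
        return None

def negotiate(needed_mbps, service_agents):
    """Collect bids and accept the lowest-cost offer, if any."""
    bids = [b for a in service_agents if (b := a.bid(needed_mbps))]
    return min(bids, key=lambda b: b["cost"]) if bids else None

agents = [ServiceAgent("net-A", 10, 5.0),
          ServiceAgent("net-B", 50, 3.0),
          ServiceAgent("net-C", 2, 1.0)]   # too little capacity; declines
winner = negotiate(8, agents)
```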

    Although successful examples of both types of agent exist, there is
    general agreement that more investment to strengthen the technology base
    is needed before robust agents can be routinely constructed. This is not
    trivial. As the nation attempts to integrate DOD and commercial
    geospatial data, many important questions remain open.

    Needed technologies include:

       * Universal language and computational models for declaring agents,
       * Representation technology for knowledge and system resources,
       * Algorithms and protocols for agent management and interagent
         negotiation and information exchange, and
       * Automated learning and user-profiling techniques.
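    One way to read the first item, a "universal language for declaring
    agents," is a common machine-readable description of what each agent
    can do, against which requests can be matched. The fields and registry
    below are illustrative assumptions, not a proposed standard.

```python
# A toy agent-declaration scheme: each agent publishes its capabilities,
# and a broker matches a requested function against the declarations.
from dataclasses import dataclass, field

@dataclass
class AgentDeclaration:
    name: str
    functions: list = field(default_factory=list)  # capabilities offered
    inputs: list = field(default_factory=list)     # data types consumed
    outputs: list = field(default_factory=list)    # data types produced

def find_providers(declarations, wanted_function):
    """Return the names of agents declaring the requested capability."""
    return [d.name for d in declarations if wanted_function in d.functions]

registry = [
    AgentDeclaration("geo-fuser", functions=["fuse", "geolocate"],
                     inputs=["imagery", "tabular"], outputs=["map-layer"]),
    AgentDeclaration("profiler", functions=["profile-user"],
                     inputs=["query-log"], outputs=["interest-model"]),
]
providers = find_providers(registry, "geolocate")
```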


    From the user's point of view, representation is an essential aspect of
    information content. How information is presented, whether to a human
    operator or to an automatic analysis system, is often a principal
    determinant of its utility. Information can also take on new
    significance when organized in useful ways. Ubiquitous, continuous, and
    pervasive computing has a particular need for compact representations of
    information because of the constraints imposed by the relatively limited
    bandwidth available for wireless communications.
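    The bandwidth argument can be made concrete with a small sketch:
    the same report, compactly encoded, needs far fewer bytes to cross a
    narrow wireless link. Here zlib merely stands in for whatever encoding
    a real system would use; the telemetry string is invented.

```python
# Compact representation under limited bandwidth: compress a highly
# repetitive telemetry report before transmission.
import zlib

report = ("SENSOR 42 STATUS NOMINAL " * 40).encode("ascii")  # 1000 bytes
compressed = zlib.compress(report, level=9)

# Repetitive telemetry compresses dramatically, so the link can carry
# many more reports per second; the receiver recovers the original exactly.
restored = zlib.decompress(compressed)
```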


    The continuing development of inexpensive, powerful processing
    capabilities ensures that the coming decades will be marked by ongoing
    advances in information technology. Increasing access to advanced
    information processing and information management capabilities will lead
    to a proliferation of activities that generate, maintain, manage, and
    exploit information, and it is certain that the Internet will be one of
    the many important players in the new world of information-centered
    activities.

    The OHS/DKR users need to be in a position to exploit a wide variety of
    available information sources. Certain domains of OHS/DKR information
    needs are unique and highly specialized, and will require focused
    investment to develop the requisite technology. This is particularly
    true in the area of software productivity, mapping, charting, and
    geodesy. Here, continued R&D and infrastructure upgrades will be
    required to produce geospatial data for the OHS/DKR users in a timely
    fashion. Other needs, which may be less unique and less specialized,
    will be met by appropriately exploiting sources of information that will
    be available in the public domain.

    New wireless Internet-accessible sensor systems and the increasing use
    of indigenous sensors are emerging from the dramatic growth of the
    commercial communications infrastructure, and the data they generate
    represent a new class of public-domain information. These systems can be
    classified into two categories: commercial systems that will be
    developed in order to sell information for profit, and sensors used in
    conjunction with information systems for the benefit of the user. The
    first category includes commercial satellite imagery, databases and
    mailing lists available for purchase, and commercially operated data
    mining sources. The second category includes automobile sensors
    communicating with a "smart" highway; smart homes providing
    communication links between appliances and manufacturers for maintenance
    and monitoring; remote camera systems operated by organizations for the
    benefit of the public, such as town-square imaging systems accessible
    over the World Wide Web; and water measurement sensors that transmit
    reservoir fill levels to public water works. Together, these two
    categories constitute an enormous body of information that, typically,
    will reside within the public domain, and from which it may be possible
    to extract, for example, data regarding the location of an individual or
    vehicle, or the state of a particular system at any given time. This
    type of information will become increasingly available.

    Accordingly, it is incumbent on the open source OHS/DKR developers to
    position themselves such that they are capable of exploiting this rich
    new class of sensors and information. In some cases, directly relevant
    information can be purchased or procured. More often, however, the
    required information must be inferred from public sources and those
    inferences then transformed into a form relevant to OHS/DKR user needs.
    For example, publicly accessible town hall sensors and reservoir data
    can be used to infer local conditions. Traffic analysis can indicate
    levels of activity, and movements of individuals can indicate
    deployments. Information on local conditions that can be inferred from
    the direct data could be extremely useful when appropriately presented
    to a commander or operator. Data mining technologies and collaborative
    filtering techniques can be used to deduce information and compact it
    succinctly for analysis and presentation.

    Indeed, the body of information that will be available can, if properly
    exploited, lead to a revolution in the intelligence field, and provide
    the data sources for intelligent software agents and automated
    inferencing engines that can be crucial to the OHS/DKR user missions.


    The utility of information depends, in large measure, on applications
    that take raw data as an input, analyze them, and transform them into a
    representation that is meaningful to OHS/DKR operators and commanders.
    This task is so demanding that, ultimately, a new class of applications
    technology will be required, which could be called information
    understanding, and which will include a suite of advanced methods for
    processing, analyzing, and representing information. Information
    understanding could greatly extend the capability of augmenting human
    systems, for example, as well as other technical means of gathering
    intelligence. Such enhanced capability could be important not only for
    cognitive recognition using wireless sensors designed to acquire
    information, but also for reasoning about disparate information sources
    on a longer time scale, to provide deep understanding and facilitate
    planning for potential CoDIAK operations. Traditionally, sensor
    information is fed to a processor that performs pattern recognition
    functions in order to detect changes. This methodology assumes, however,
    that sensor data is a rare and precious commodity that must be processed
    immediately. It also assumes that the relevant information is localized
    in a sensor stream. In a sensor-rich environment, the timing of the
    processing can be matched to the requirements of the application. New
    applications for the exploitation of wireless sensor information are
    afforded by the ability to consider processing outputs from multiple
    disparate sensor sources over longer periods of time.
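    The shift described above, from processing each sensor return
    immediately to buffering disparate sources and reasoning over a longer
    window, can be sketched as follows. The sensor names, readings, and
    drift test are illustrative assumptions.

```python
# Deferred fusion: buffer time-stamped readings from multiple sensors,
# then analyze the whole window at once instead of each return in isolation.
from collections import defaultdict

class DeferredFusion:
    def __init__(self):
        self.buffer = defaultdict(list)  # sensor id -> [(timestamp, value)]

    def ingest(self, sensor_id, timestamp, value):
        self.buffer[sensor_id].append((timestamp, value))

    def changed_sensors(self, threshold):
        """After the window closes, flag sensors whose value drifted."""
        flagged = []
        for sid, readings in self.buffer.items():
            readings.sort()  # order by timestamp
            drift = abs(readings[-1][1] - readings[0][1])
            if drift > threshold:
                flagged.append(sid)
        return flagged

fusion = DeferredFusion()
fusion.ingest("reservoir", 0, 12.0)
fusion.ingest("traffic", 0, 100.0)
fusion.ingest("reservoir", 3600, 12.1)   # essentially unchanged
fusion.ingest("traffic", 3600, 480.0)    # large change over the window
alerts = fusion.changed_sensors(threshold=50.0)
```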

    As mentioned above, certain applications require that decisions must be
    made immediately, and so require rapid access to information with
    minimal latencies. Other applications make decisions that are based on
    information that has a long time constant, and thus might involve
    processing times of hours, days, or weeks. As such,
    these applications can afford the luxury of accessing massive databases.
    If the processing must occur in time t, and the bandwidth is B, then the
    maximum amount of information available to the application in order to
    make a decision will be at most tB. In order to make an intelligent
    decision, a certain amount of information is always necessary, and thus
    bandwidth requirements are necessarily high for applications that
    require timely decisions. However, applications that can be executed at
    a more leisurely pace have the opportunity to make
    more intelligent decisions by massively increasing the total amount of
    information available to the processor, either by virtue of the
    additional time, or through large bandwidth capabilities, or both.
    Accordingly, requirements for timely decisions impose constraints on the
    amount of data that can be accessed, whereas longer-term applications
    can access large, distributed, disparate databases and make use of more
    intensive intelligent processing. This relationship is illustrated
    schematically in Figure 1.

              FIGURE 1 Categorization of information applications.
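    The tB bound above can be checked with a short calculation; the link
    speed and decision times are illustrative figures only.

```python
# The information bound: in time t with bandwidth B, an application can
# see at most t * B bits before it must decide.

def max_info_bits(t_seconds, bandwidth_bps):
    return t_seconds * bandwidth_bps

# A one-second decision over a 1 Mbit/s link:
fast = max_info_bits(1, 1_000_000)          # 1,000,000 bits
# A one-day analysis over the same link:
slow = max_info_bits(24 * 3600, 1_000_000)  # 86,400,000,000 bits
# The slower application can weigh 86,400 times more information.
```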

    Not only is there a tradeoff between timeliness and the amount of
    information accessible to the process, but the kinds of information
    sources that are useful will also be affected by the type of
    application. The value of some information decays over time, and
    applications with long processing times will, in general, only be
    utilized for processing information whose value persists over a
    reasonable time scale. On the other hand, applications that make
    relatively fast decisions will need immediate access to timely
    information, and thus will likely be tightly coupled to wireless
    Internet-accessible sensor systems.

    Indeed, there is an overriding need for awareness of what information is
    available and where it can be located, as well as for timeliness and
    assurance of the information sources. With such awareness, information
    can be matched to the application, and action can be taken in advance to
    ensure that the information will be available when needed. Finally, it
    is important to be able to perform inferencing, to adapt information to
    representations that are useful for OHS/DKR user needs, and to fuse
    information from multiple sources. Processing that performs inferencing
    and transformation of information is thus required not only to aid the
    interpretation, but also for compression.

    Considering the proliferation of information sources, and the need to
    match sources to the classes of applications, it is incumbent on the
    OHS/DKR user to develop an awareness of the available information. In
    order to perform these functions and to ensure timely and convenient
    access to those sources, responsibility should be designated within the
    Distributed DOM for the identification, organization, and classification
    of all relevant information sources. Assembling links to information
    sources will include awareness of novel information providers, creation
    of specific databases, mirroring of certain databases for rapid
    accessibility, and vigilance in the maintenance of the quality of the
    information sources.


    Information understanding involves the fusion of data that may be
    spatially and temporally distributed in order to form a coherent picture
    of a situation of interest. Information understanding depends on the
    ability to recognize and extract relevant data from large and disparate
    data collections: extracting useful information from large sets of
    redundant, unstructured, and largely irrelevant data will often be the
    first step in developing information understanding. In the commercial
    world, current extraction techniques utilize data mining.

    Data mining currently focuses on the need of credit card companies to
    automatically recognize spending patterns that indicate probable fraud,
    based not only on current purchases, but also on the extent to which the
    current pattern is unusual for the card in question. Other business uses
    of data mining and collaborative filtering include profiling of
    potential customers based on their spending patterns, so as to target
    marketing efforts to the most likely consumers of products and services.
    Since the marketplace rewards businesses that can exploit a comparative
    advantage, data mining tools for business applications will inevitably
    become an important part of mainstream commerce. In medical data
    processing, there is the possibility of developing automated diagnostic
    procedures that identify conditions or pathology from multiple test
    results. OHS/DKR user needs are conceptually similar, but broader and
    different in scope: information relevant to global sustainability can be
    extracted from nearly all information sources. Further, rather than
    focusing on securing a competitive advantage in sales and marketing of
    goods and services, OHS/DKR user needs include more general
    intelligence, indications, and warnings, and other information that can
    facilitate planning and execution.
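    The card-fraud pattern described above can be reduced to a toy
    example: a purchase is flagged when it lies far outside the card's own
    historical spending pattern. The z-score test, threshold, and purchase
    data are illustrative assumptions.

```python
# Flag a purchase that deviates strongly (> 3 standard deviations)
# from this particular card's own spending history.
import statistics

def is_unusual(history, amount, z_threshold=3.0):
    """True if the purchase is far outside the card's historical pattern."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return amount != mean
    return abs(amount - mean) / stdev > z_threshold

history = [22.0, 18.5, 25.0, 30.0, 19.0, 27.5]  # typical purchases
routine = is_unusual(history, 24.0)    # within the usual pattern
suspect = is_unusual(history, 900.0)   # far outside the pattern
```

    Production fraud systems combine many such features, but the core
    idea, "unusual for the card in question," is the same.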

    Information understanding technologies to meet OHS/DKR user needs may
    draw upon the same underlying theory that supports commercial
    information extraction techniques, but generally will require a
    different set of applications. Currently, the Internet can be viewed as
    a primitive form of information understanding technology, which should
    ultimately lead to global analysis and automated situation awareness,
    and all-source automated multisensor analysis. The development of these
    capabilities will be driven by OHS/DKR user needs and will be
    facilitated by advances in sensors, communications, and computation.


    Recognition theory refers to the body of knowledge underlying the
    development of tools for extracting information from large and varied
    data sets and is the underlying foundation of those technologies that
    are referred to in this report as information understanding. The theory
    of pattern recognition, which involves the identification of distinctive
    patterns in signal data, is a special case of recognition theory.
    Typically, pattern recognition uses a single image or a single return
    signal and attempts to distinguish among a fixed collection of
    possibilities in order to characterize the given data. More broadly,
    recognition theory encompasses systems with greater cognitive processing
    capability that are flexible enough to effect recognition in the context
    of situations and scenarios that have not been explicitly programmed
    into the recognition system. Further, recognition theory
    should enable the development of systems that can discover associations
    among disparate pieces of information.

    Methods developed in the field of artificial intelligence (AI),
    including commonsense reasoning, nonmonotonic logic, circumscription,
    algorithms used in neural networks, and extensions to Bayesian calculi,
    have largely failed to provide the understanding required to develop a
    coherent theory of generalized recognition. Accordingly, recognition
    does not yet exist as a differentiated discipline. However, given the
    ongoing progress in AI research, the panel anticipates that a coherent
    theory of recognition will emerge. Further research and development is
    needed to develop the capacity to reason in the face of uncertainty and to
    fuse information from disparate sources.

    Automatic situation-awareness recognition uses recognition theory in
    limited ways. Most automatic situation-awareness development is
    currently limited to the pattern recognition subset of recognition
    theory, being based on analysis of single image frames and segmented
    target regions. However, more generalized automatic situation-awareness
    processing would take advantage of multiple geo-registered information
    sources and temporally displaced data in order to dynamically reason
    about situations.

    One of the main differences between the theory of pattern recognition
    and more general recognition theory is summed up in the standard
    distinction between bottom-up and top-down processing. Recognition
    theory seeks a solution to the problem of identifying and extracting
    information that is relevant to a particular working hypothesis from
    large and highly varied sets of data. Since extraction and analysis are
    driven by a hypothesis, recognition theory can be viewed as largely
    top-down processing. Currently, most recognition systems work in a
    bottom-up fashion, first extracting features from the given sensor data,
    and then looking for patterns among the features that support a model
    hypothesis. Although hypotheses are formed in the course of executing
    pattern recognition, it is the sensory data that largely dictates the
    flow of processing, and bottom-up processing is the
    more appropriate description for the information flow. When data sets
    become too large to carry out bottom-up processing, and when information
    must be extracted from multiple and highly varied sources, processing
    methods necessarily must use analogs of inverse indices and top-down
    processing.
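    An inverted index is one concrete analog of the "inverse indices"
    mentioned above: each feature maps directly to the records containing
    it, so a top-down hypothesis retrieves candidates without a bottom-up
    scan of every record. The corpus below is invented for illustration.

```python
# Build an inverted index (term -> set of document ids), then answer a
# top-down query by intersecting the posting lists of its terms.
from collections import defaultdict

def build_inverted_index(docs):
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def query(index, *terms):
    """Top-down retrieval: intersect posting lists for all query terms."""
    sets = [index.get(t, set()) for t in terms]
    return set.intersection(*sets) if sets else set()

docs = {
    "d1": "reservoir level rising near town square",
    "d2": "traffic volume on smart highway",
    "d3": "reservoir sensor offline near highway",
}
hits = query(build_inverted_index(docs), "reservoir", "near")
```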


    The OHS/DKR team assumes a future in which sensors and information will
    be ubiquitous. Encryption will be used to protect certain vital
    information, such as bank transactions, but massive amounts of other
    information will be available for analysis. Not only will personal and
    official messages be passed digitally, but every appliance will also be
    communicating by networks with remote controllers, and every individual
    can be expected to be in constant contact with a vast interconnected
    digital network. Highway tolls will be paid electronically, and packets
    containing information as to the whereabouts of any moving private
    vehicle will likely be available. This sea of information will include
    data about individuals from government and commercial sources. It is
    reasonable to assume that the whereabouts, movement, purpose, and plans
    of most individuals will be discernible from an analysis of specialized
    information, and that most businesses and companies will have massive
    incentives to perform such analyses in order to target their marketing
    to the appropriate potential customers. Although encryption of the
    information may afford some privacy to individuals, analysis of data
    traffic patterns may provide nearly equivalent information, at least in
    a statistical sense. To the extent that information can be captured, it
    can also be archived, and it is anticipated that a massive, distributed,
    dynamic database of archived information will be developed specifically
    for the collective OHS/DKR user needs. The technology that will be
    developed to analyze and exploit the sea of information that will be
    available in the future will pose both challenges and opportunities for
    the OHS/DKR evolution.


    While much of the research that is required for the development of
    technologies to support information understanding is currently ongoing,
    it is not sufficiently focused on developing information understanding
    applications for augmenting human capabilities.

    The following six technology areas merit special attention in
    order to realize the information understanding capabilities that will be
    required to analyze and exploit the sea of information that will
    characterize the future information environment:

    1. Information representation.
    Information representation involves extracting and representing features
    from data streams in such a way that the relevant information can be
    accessed efficiently from automated queries. Methods of information
    retrieval likely will include inverse indices and distributed processing
    using intelligent memory. It will be necessary to develop the means to
    appropriately represent information without prior knowledge of the
    likely hypotheses that might later be used to extract the representation
    or to associate other data with it.

    2. Information reasoning.
    Information reasoning involves the capacity to reason in the face of
    uncertainty, and may include the use of models to predict degrees of
    dependence and independence between data sets and other strategies in
    order to effectively hypothesize and test premises for the purpose of
    extracting relevant information content. Further advances in recognition
    theory are needed, including methods for combining data and forming
    inferences. Recent developments in neural network theory suggest that it
    may be possible to create adaptive reasoning systems, but further
    advances are required before such systems can be realized.
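    A minimal instance of reasoning under uncertainty is a Bayesian
    update: belief in a hypothesis is revised as each piece of evidence
    arrives. The prior and likelihoods below are illustrative numbers, not
    drawn from the text.

```python
# Sequential Bayesian updating of belief in a single hypothesis.

def bayes_update(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Posterior probability of the hypothesis after one observation."""
    numerator = p_evidence_given_h * prior
    denominator = numerator + p_evidence_given_not_h * (1 - prior)
    return numerator / denominator

belief = 0.10  # prior: hypothesis initially unlikely
# Each pair is (P(evidence | H), P(evidence | not H)) for one observation.
for likelihoods in [(0.8, 0.3), (0.7, 0.2), (0.9, 0.4)]:
    belief = bayes_update(belief, *likelihoods)
# Three consistent observations raise the belief from 0.10 to about 0.70.
```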

    3. Information search.
    Since information understanding will most likely work with a top-down
    structure, methods are needed to organize hypotheses hierarchically, in
    order to structure the search for content logically and efficiently. In
    the same way that model-based systems generate hypotheses that are
    verified and refined in a tree-search structure, analogs are needed to
    organize the search for information content. Further, the search cannot
    be hand-crafted for each recognition system application. Instead,
    methods are needed to automatically generate the search trees and
    hypothesis organization strategies.
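    Hierarchically organized hypotheses can be searched with a pruned tree
    walk: coarse hypotheses are tested first, and only branches consistent
    with the evidence are refined. The hypothesis tree and the evidence
    test below are invented for illustration.

```python
# Depth-first refinement of a hypothesis tree; prune any branch the
# evidence rejects. Each node is (name, [child nodes]).

def search(node, supports, path=()):
    name, children = node
    if not supports(name):
        return []          # prune: evidence contradicts this hypothesis
    here = path + (name,)
    if not children:
        return [here]      # a fully refined, supported hypothesis
    results = []
    for child in children:
        results.extend(search(child, supports, here))
    return results

tree = ("activity", [
    ("vehicle", [("truck", []), ("car", [])]),
    ("pedestrian", []),
])
evidence = {"activity", "vehicle", "truck"}
confirmed = search(tree, lambda h: h in evidence)
```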

    4. Information integrity.
    Because data might be corrupted, faked, or inaccurate, not all
    information sources should be trusted equally. While technologies exist
    for authenticating information and securing its transfer, means of
    assessing confidence in information sources, and the ability to discard
    untrustworthy information, are topics that need further development.
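    One simple form such confidence assessment might take is a
    trust-weighted fusion rule: sources below a trust cutoff are
    discarded, and the rest contribute in proportion to an assigned trust
    score. The scores and cutoff are illustrative assumptions.

```python
# Trust-weighted averaging of numeric reports, discarding sources whose
# trust score falls below a cutoff.

def fuse_reports(reports, min_trust=0.3):
    """reports is a list of (value, trust) pairs; returns the fused estimate."""
    kept = [(v, t) for v, t in reports if t >= min_trust]
    if not kept:
        return None
    total = sum(t for _, t in kept)
    return sum(v * t for v, t in kept) / total

reports = [
    (10.0, 0.9),   # well-authenticated sensor
    (11.0, 0.6),   # secondary source
    (55.0, 0.1),   # unverified, possibly faked: discarded
]
estimate = fuse_reports(reports)  # about 10.4, unmoved by the outlier
```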

    5. Information presentation.
    Information presentation, as opposed to representation, is the manner in
    which processed data is supplied to the human operator or commander.
    This involves the human-machine interface as well as the specific manner
    in which the data is displayed and its context established. Capabilities
    for data visualization and multimedia presentation of information will
    be important for the best performance of an information understanding
    system that necessarily includes a human operator as an integral
    subcomponent of the system.

    6. Human-performance prediction.
    An information understanding system that includes the human operator as
    the final arbitrator and decisionmaker can be effective only if the
    human-machine interface is optimized with respect to human performance
    in the context of the task at hand. Accordingly, it will be necessary to
    acquire greater understanding of human cognition and decisionmaking.

    Because information understanding is a cross-cutting endeavor, other
    technology enablers in addition to those listed above will play a role
    in its realization. For example, wireless networking technology,
    including data transfer and connectivity standards, will be an important
    enabler.

    This archive was generated by hypermail 2b29 : Sun May 21 2000 - 01:13:35 PDT