He began wondering how he might use technology to solve society's
problems. "The complexity that mankind is facing is getting greater and
greater," he said to himself, "so our problems are getting more complex,
and the time we have to deal with them is getting shorter and more
urgent. Is there anything I can do to contribute to mankind's ability to
cope with complexity and urgency?" The more he thought about it, the
more it moved him. "That," he concluded, "would be a terrific goal."
In 1998, Edward O. Wilson wrote (Consilience: The Unity of Knowledge.
New York: Knopf. 332 pp.) that:
"A great deal of serious thinking is needed to navigate the decades
immediately ahead. … Only unified learning, universally shared, makes
accurate foresight and wise choice possible. … We are learning the
fundamental principle that ethics is everything."
The strategy for pursuit of this vision will be information based,
knowledge guided, stakeholder driven, and human centered. It will
develop four dimensions of bootstrapping:
* discovery (through research),
* integration (through interdisciplinary collaboration),
* dissemination (through education that becomes life-long learning),
and
* application (through cooperation among academia, business and
industry, the several levels of government, and nongovernmental
organizations).
Wrestling with the complex problems of providing a sustainable world for
future generations is the most crucial challenge society and governments
face today and into the new millennium. Success in developing effective
strategies to confront this formidable challenge will require novel
forms of cooperation and collaboration between and among institutions
across all jurisdictional boundaries. While processes for moving toward
this desired future condition are taking place, their evolution is being
accelerated by the rapidly increasing knowledge in science, engineering
and technology—and by the emergence of increasingly practical ways for
disseminating and using data and information for the creation of
knowledge.
These trends strongly indicate that it is timely for nations of the
world to galvanize the scientific and technological capabilities of
their institutions into concerted action for advancing research and
monitoring for managing their economic, social and ecological systems in
a sustainable manner across continental scales. Working in partnership
to carry out this common task will further galvanize our institutions to
confront successfully the complex sustainability challenges of the new
millennium.
OHS/DKR INTELLIGENT SOFTWARE AGENTS
Intelligent service application software agents must provide tailored,
human-centered data acquisition and processing, data fusion, and
information generation and dissemination to users. These agents act to
deliver processed, synoptic information rather than volumes of data and
images. A function such agents serve is rapid search and discovery of
geographic knowledge. The basis for such search is often geographic
location. The object is to retrieve all data and information concerning
a place. These data and information must be retrieved and organized into
a form from which the user can effectively and efficiently extract
required information. The service application software agents
collaborate with other software agents to achieve general goals set by
users, and based on user profiling, generate pertinent situation changes
that may be of interest to the user. The agents support automatic,
dynamic, adaptive allocation of transport and processing resources, and
replicate as necessary for efficiency and to ensure continuity of
services provided to the user.
Intelligent application software agents must provide an array of
functions appropriate to the user's mission and situation, and exchange
information and status with other application software agents to provide
integrated yet distributed execution of requested user services. These
agents automatically select and perform their functions depending on
specific user requirements and profiled user interest areas. The agents
provide discovery and integration of text, tabular and geospatial data
from multiple, heterogeneous databases, broker between other agents for
sharing of information, and negotiate with service agents to establish
appropriate network and resource allocations to achieve their goals.
These agents are adaptive, in that they profile user needs for
information such as measurements, targets, maps, changes in areas, and
models against direct user input, past user requirements, and an
understanding of user mission, status, and intentions.
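The profiling behavior described above can be sketched in a few lines. The class names, fields, and matching rule below are illustrative assumptions, not part of any existing OHS/DKR design; the sketch only shows how an agent might filter situation changes against a profiled user's interest areas and regions.

```python
from dataclasses import dataclass, field

@dataclass
class UserProfile:
    """Profiled interest areas and regions (hypothetical fields)."""
    interest_areas: set = field(default_factory=set)
    regions: set = field(default_factory=set)

@dataclass
class SituationChange:
    topic: str
    region: str
    summary: str

def pertinent_changes(profile, changes):
    """Keep only the situation changes that match the user's profile."""
    return [c for c in changes
            if c.topic in profile.interest_areas and c.region in profile.regions]

profile = UserProfile({"hydrology"}, {"springfield"})
changes = [
    SituationChange("hydrology", "springfield", "reservoir level dropped 2 m"),
    SituationChange("traffic", "springfield", "congestion on route 9"),
]
print(pertinent_changes(profile, changes)[0].summary)  # reservoir level dropped 2 m
```

A real agent would of course learn the profile from past requirements rather than take it as a static set, but the filtering step would look much the same.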
Although successful examples of both types of agent exist, there is
general agreement that more investment to strengthen the technology base
is needed before robust agents can be routinely constructed. This is not
trivial. As the nation attempts to integrate DOD and commercial
geospatial data, many important questions remain open.
Needed technologies include:
* Universal language and computational models for declaring agents,
* Representation technology for knowledge and system resources,
* Algorithms and protocols for agent management and interagent
negotiation and information exchange, and
* Automated learning and user-profiling techniques.
INFORMATION CONTENT
From the user's point of view, representation is an essential aspect of
information content. How information is presented, whether to a human
operator or to an automatic analysis system, is often a principal
determinant of its utility. Information can also take on new
significance when organized in useful ways. Ubiquitous, continuous, and
pervasive computing has a particular need for compact representations of
information because of the constraints imposed by the relatively limited
bandwidth available for wireless communications.
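One common route to the compact representations mentioned above is delta encoding: transmit the first sensor sample in full and only differences thereafter, so that slowly varying readings cost few bits over a constrained wireless link. This is a minimal illustrative sketch, not a representation prescribed by the text.

```python
def delta_encode(samples):
    """First sample absolute, then only the successive differences."""
    if not samples:
        return []
    out = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        out.append(cur - prev)
    return out

def delta_decode(encoded):
    """Invert delta_encode by accumulating the differences."""
    out, total = [], 0
    for d in encoded:
        total += d
        out.append(total)
    return out

readings = [1000, 1002, 1001, 1005]
enc = delta_encode(readings)
assert delta_decode(enc) == readings
print(enc)  # [1000, 2, -1, 4]
```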
INFORMATION SOURCES
The continuing development of inexpensive, powerful processing
capabilities ensures that the coming decades will be marked by ongoing
advances in information technology. Increasing access to advanced
information processing and information management capabilities will lead
to a proliferation of activities that generate, maintain, manage, and
exploit information, and it is certain that the Internet will be one of
the many important players in the new world of information-centered
activities.
The OHS/DKR users need to be in a position to exploit a wide variety of
available information sources. Certain domains of OHS/DKR information
needs are unique and highly specialized, and will require focused
investment to develop the requisite technology. This is particularly
true in the area of software productivity, mapping, charting, and
geodesy. Here, continued R&D and infrastructure upgrades will be
required to produce geospatial data for the OHS/DKR users in a timely
fashion. Other needs, which may be less unique and less specialized,
will be met by appropriately exploiting sources of information that will
be available in the public domain.
New wireless Internet-accessible sensor systems and the increasing use
of indigenous sensors are emerging from the dramatic growth of the
commercial communications infrastructure, and the data they generate
represent a new class of public-domain information. These systems can be
classified into two categories: commercial systems that will be
developed in order to sell information for profit, and sensors used in
conjunction with information systems for the benefit of the user. The
first category includes commercial satellite imagery, databases and
mailing lists available for purchase, and commercially operated data
mining sources. The second category includes automobile sensors
communicating with a "smart" highway; smart homes providing
communication links between appliances and manufacturers for maintenance
and monitoring; remote camera systems operated by organizations for the
benefit of the public, such as town-square imaging systems accessible
over the World Wide Web; and water measurement sensors that transmit
reservoir fill levels to public water works. Together, these two
categories constitute an enormous body of information that, typically,
will reside within the public domain, and from which it may be possible
to extract, for example, data regarding the location of an individual or
vehicle, or the state of a particular system at any given time. This
type of information will become increasingly available.
Accordingly, it is incumbent on the open source OHS/DKR developers to
position themselves such that they are capable of exploiting this rich
new class of sensors and information. In some cases, directly relevant
information can be purchased or procured. More often, however, the
required information must be inferred from public sources and those
inferences then transformed into a form relevant to OHS/DKR user needs.
For example, publicly accessible town hall sensors and reservoir data
can be used to infer local conditions. Traffic analysis can indicate
levels of activity, and movements of individuals can indicate
deployments. Information on local conditions that can be inferred from
the direct data could be extremely useful when appropriately presented
to a commander or operator. Data mining technologies and collaborative
filtering techniques can be used to deduce information and compact it
succinctly for analysis and presentation.
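The inference step described above can be sketched simply: compare current public sensor readings, such as traffic counts, against a historical baseline to infer an activity level. The thresholds and labels here are arbitrary assumptions for illustration.

```python
def infer_activity_level(traffic_counts, baseline):
    """Infer local activity from traffic counts vs. a historical baseline."""
    avg = sum(traffic_counts) / len(traffic_counts)
    ratio = avg / baseline
    if ratio > 1.5:
        return "elevated activity"
    if ratio < 0.5:
        return "reduced activity"
    return "normal activity"

print(infer_activity_level([120, 130, 140], baseline=80))  # elevated activity
```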
Indeed, the body of information that will be available can, if properly
exploited, lead to a revolution in the intelligence field, and provide
the data sources for intelligent software agents and automated
inferencing engines that can be crucial to the OHS/DKR user missions.
APPLICATIONS
The utility of information depends, in large measure, on applications
that take raw data as an input, analyze them, and transform them into a
representation that is meaningful to OHS/DKR operators and commanders.
This task is so demanding that, ultimately, a new class of applications
technology will be required, which could be called information
understanding, and which will include a suite of advanced methods for
processing, analyzing, and representing information. Information
understanding could greatly extend the capability of augmenting human
systems, for example, as well as other technical means of gathering
intelligence. Such enhanced capability could be important not only for
cognitive recognition using wireless sensors designed to acquire
information, but also for reasoning about disparate information sources
on a longer time scale, to provide deep understanding and facilitate
planning for potential CoDIAK operations. Traditionally, sensor
information is fed to a processor that performs pattern recognition
functions in order to detect changes. This methodology assumes, however,
that sensor data is a rare and precious commodity that must be processed
immediately. It also assumes that the relevant information is localized
in a sensor stream. In a sensor-rich environment, the timing of the
processing can be matched to the requirements of the application. New
applications for the exploitation of wireless sensor information are
afforded by the ability to consider processing outputs from multiple
disparate sensor sources over longer periods of time.
As mentioned above, certain applications require that decisions be made
immediately, and so need rapid access to information with minimal
latency. Other applications make decisions based on information with a
long time constant, and thus can tolerate processing times of hours,
days, or weeks. Such applications can afford the luxury of accessing
massive databases.
If the processing must occur in time t, and the bandwidth is B, then the
maximum amount of information available to the application in order to
make a decision will be at most tB. In order to make an intelligent
decision, a certain amount of information is always necessary, and thus
bandwidth requirements are necessarily high for applications that
require timely decisions. However, applications that can be executed at
a more leisurely pace have the opportunity to make
more intelligent decisions by massively increasing the total amount of
information available to the processor, either by virtue of the
additional time, or through large bandwidth capabilities, or both.
Accordingly, requirements for timely decisions impose constraints on the
amount of data that can be accessed, whereas longer-term applications
can access large, distributed, disparate databases and make use of more
intensive intelligent processing. This relationship is illustrated
schematically in Figure 1.
FIGURE 1 Categorization of information applications.
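The tB bound above is easy to make concrete. The numbers below are arbitrary illustrations: a decision due in 2 seconds over a 1 Mbit/s link can draw on at most 2 Mbit of information, however large the underlying databases are.

```python
def max_information_bits(t_seconds, bandwidth_bps):
    """Upper bound tB on the information accessible before the deadline."""
    return t_seconds * bandwidth_bps

print(max_information_bits(2, 1_000_000))  # 2000000
```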
Not only is there a tradeoff between timeliness and the amount of
information accessible to the process, but the kinds of information
sources that are useful will also be affected by the type of
application. The value of some information decays over time, and
applications with long processing times will, in general, only be
utilized for processing information whose value persists over a
reasonable time scale. On the other hand, applications that make
relatively fast decisions will need immediate access to timely
information, and thus will likely be tightly coupled to wireless
Internet-accessible sensor systems.
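The decay of information value over time can be modeled in many ways; an exponential half-life model is one simple assumption (not one the text commits to) that captures the distinction between perishable and persistent information.

```python
def information_value(initial_value, age_seconds, half_life_seconds):
    """Exponential decay model of information value (illustrative assumption)."""
    return initial_value * 0.5 ** (age_seconds / half_life_seconds)

# A sensor report worth 1.0 at capture, with a 60 s half-life:
print(round(information_value(1.0, 120, 60), 2))  # 0.25
```

Under such a model, a fast-decision application would discard any report older than a few half-lives, while a long-time-constant application could still use it.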
Indeed, there is an overriding need for awareness of what information is
available and where it can be located, as well as for timeliness and
assurance of the information sources. With such awareness, information
can be matched to the application, and action can be taken in advance to
ensure that the information will be available when needed. Finally, it
is important to be able to perform inferencing, to adapt information to
representations that are useful for OHS/DKR user needs, and to fuse
information from multiple sources. Processing that performs inferencing
and transformation of information is thus required not only to aid the
interpretation, but also for compression.
Considering the proliferation of information sources, and the need to
match sources to the classes of applications, it is incumbent on the
OHS/DKR user to develop an awareness of the available information. In
order to perform these functions and to ensure timely and convenient
access to those sources, responsibility should be designated within the
Distributed DOM for the identification, organization, and classification
of all relevant information sources. Assembling links to information
sources will include awareness of novel information providers, creation
of specific databases, mirroring of certain databases for rapid
accessibility, and vigilance in the maintenance of the quality of the
databases.
INFORMATION UNDERSTANDING
Information understanding involves the fusion of data that may be
spatially and temporally distributed in order to form a coherent picture
of a situation of interest. Information understanding depends on the
ability to recognize and extract relevant data from large and disparate
data collections; extracting useful information from large sets of
redundant, unstructured, and largely irrelevant data will often be the
first step in developing information understanding. In the commercial
world, current extraction techniques rely on data mining.
Data mining currently focuses on the need of credit card companies to
automatically recognize spending patterns that indicate probable fraud,
based not only on current purchases, but also on the extent to which the
current pattern is unusual for the card in question. Other business uses
of data mining and collaborative filtering include profiling of
potential customers based on their spending patterns, so as to target
marketing efforts to the most likely consumers of products and services.
Since the marketplace rewards businesses that can exploit a comparative
advantage, data mining tools for business applications will inevitably
become an important part of mainstream commerce. In medical data
processing, there is the possibility of developing automated diagnostic
procedures that identify conditions or pathology from multiple test
results. OHS/DKR user needs are conceptually similar, but broader and
different in scope: information relevant to global sustainability can be
extracted from nearly all information sources. Further, rather than
focusing on securing a competitive advantage in sales and marketing of
goods and services, OHS/DKR user needs include more general
intelligence, indications, and warnings, and other information that can
facilitate planning and execution.
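The fraud-detection pattern described above is, at its core, anomaly detection: flag a transaction whose amount deviates strongly from the card's own history. This sketch uses a simple z-score test with an arbitrary threshold; real systems use far richer features.

```python
import statistics

def is_unusual(history, new_amount, threshold=3.0):
    """Flag a purchase far outside the card's historical spending pattern."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return new_amount != mean
    return abs(new_amount - mean) / stdev > threshold

history = [20, 25, 22, 30, 18, 27]
print(is_unusual(history, 24))   # False
print(is_unusual(history, 900))  # True
```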
Information understanding technologies to meet OHS/DKR user needs may
draw upon the same underlying theory that supports commercial
information extraction techniques, but generally will require a
different set of applications. Currently, the Internet can be viewed as
a primitive form of information understanding technology, which should
ultimately lead to global analysis and automated situation awareness,
and all-source automated multisensor analysis. The development of these
capabilities will be driven by OHS/DKR user needs and will be
facilitated by advances in sensors, communications, and computation.
RECOGNITION
Recognition theory refers to the body of knowledge underlying the
development of tools for extracting information from large and varied
data sets and is the underlying foundation of those technologies that
are referred to in this report as information understanding. The theory
of pattern recognition, which involves the identification of distinctive
patterns in signal data, is a special case of recognition theory.
Typically, pattern recognition uses a single image or a single return
signal and attempts to distinguish among a fixed collection of
possibilities in order to characterize the given data. More broadly,
recognition theory encompasses systems with greater cognitive processing
capability that are flexible enough to effect recognition in the context
of situations and scenarios that have not been explicitly programmed
into the recognition system. Further, recognition theory
should enable the development of systems that can discover associations
among disparate pieces of information.
Methods developed in the field of artificial intelligence (AI),
including commonsense reasoning, nonmonotonic logic, circumscription,
algorithms used in neural networks, and extensions to Bayesian calculi,
have largely failed to provide the understanding required to develop a
coherent theory of generalized recognition. Accordingly, recognition
does not yet exist as a differentiated discipline. However, given the
ongoing progress in AI research, the panel anticipates that a coherent
theory of recognition will emerge. Further research and development is
needed to develop the capacity to reason in the face of uncertainty and
to fuse information from disparate sources.
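The Bayesian calculi mentioned above give one concrete mechanism for reasoning under uncertainty: each new observation updates the probability of a hypothesis, and observations from independent sources can be fused by chaining updates. The likelihood numbers below are invented for illustration.

```python
def bayes_update(prior, likelihood_given_h, likelihood_given_not_h):
    """One Bayesian update: P(H|E) from P(H) and the two likelihoods P(E|H), P(E|~H)."""
    num = likelihood_given_h * prior
    den = num + likelihood_given_not_h * (1 - prior)
    return num / den

# Fuse two independent observations by chaining updates:
p = 0.5
p = bayes_update(p, 0.9, 0.2)  # a strong source supports the hypothesis
p = bayes_update(p, 0.8, 0.3)  # a weaker source also supports it
print(round(p, 3))  # 0.923
```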
Automatic situation-awareness recognition uses recognition theory in
limited ways. Most automatic situation-awareness development is
currently limited to the pattern recognition subset of recognition
theory, being based on analysis of single image frames and segmented
target regions. However, more generalized automatic situation-awareness
processing would take advantage of multiple geo-registered information
sources and temporally displaced data in order to reason dynamically
about situations.
One of the main differences between the theory of pattern recognition
and more general recognition theory is summed up in the standard
distinction between bottom-up and top-down processing. Recognition
theory seeks a solution to the problem of identifying and extracting
information that is relevant to a particular working hypothesis from
large and highly varied sets of data. Since extraction and analysis are
driven by a hypothesis, recognition theory can be viewed as largely
top-down processing. Currently, most recognition systems work in a
bottom-up fashion, first extracting features from the given sensor data,
and then looking for patterns among the features that support a model
hypothesis. Although hypotheses are formed in the course of executing
pattern recognition, it is the sensory data that largely dictates the
flow of processing, and bottom-up processing is the
more appropriate description for the information flow. When data sets
become too large to carry out bottom-up processing, and when information
must be extracted from multiple and highly varied sources, processing
methods necessarily must use analogs of inverse indices and top-down
processing.
THE FUTURE INFORMATION ENVIRONMENT
The OHS/DKR team assumes a future in which sensors and information will
be ubiquitous. Encryption will be used to protect certain vital
information, such as bank transactions, but massive amounts of other
information will be available for analysis. Not only will personal and
official messages be passed digitally, but every appliance will also be
communicating by networks with remote controllers, and every individual
can be expected to be in constant contact with a vast interconnected
digital network. Highway tolls will be paid electronically, and packets
containing information as to the whereabouts of any moving private
vehicle will likely be available. This sea of information will include
data about individuals from government and commercial sources. It is
reasonable to assume that the whereabouts, movement, purpose, and plans
of most individuals will be discernible from an analysis of specialized
information, and that most businesses and companies will have massive
incentives to perform such analyses in order to target their marketing
to the appropriate potential customers. Although encryption of the
information may afford some privacy to individuals, analysis of data
traffic patterns may provide nearly equivalent information, at least in
a statistical sense. To the extent that information can be captured, it
can also be archived, and it is anticipated that a massive, distributed,
dynamic database of archived information will be developed specifically
for the collective OHS/DKR user needs. The technology that will be
developed to analyze and exploit the sea of information that will be
available in the future will pose both challenges and opportunities for
the OHS/DKR evolution.
ADVANCES NEEDED TO SUPPORT INFORMATION UNDERSTANDING
While much of the research that is required for the development of
technologies to support information understanding is currently ongoing,
it is not sufficiently focused on developing information understanding
applications for augmenting human capabilities.
The following six technology areas merit special attention in order to
realize the information understanding capabilities that will be required
to analyze and exploit the sea of information that will characterize the
future information environment:
1. Information representation.
Information representation involves extracting and representing features
from data streams in such a way that the relevant information can be
accessed efficiently from automated queries. Methods of information
retrieval likely will include inverse indices and distributed processing
using intelligent memory. It will be necessary to develop the means to
appropriately represent information without prior knowledge of the
likely hypotheses that might later be used to extract the representation
or to associate other data with it.
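The inverse indices mentioned above are a standard representation: map each feature (here simply a word) to the set of records containing it, so that queries can go straight from a feature to its occurrences without scanning the data. The sample documents are invented for illustration.

```python
from collections import defaultdict

def build_inverted_index(documents):
    """Map each word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

docs = {
    1: "reservoir level low",
    2: "traffic level high",
    3: "reservoir intake repaired",
}
index = build_inverted_index(docs)
print(sorted(index["reservoir"]))  # [1, 3]
print(sorted(index["level"]))      # [1, 2]
```

Crucially, the index is built before any hypothesis is known, which is exactly the property the paragraph above asks of an information representation.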
2. Information reasoning.
Information reasoning involves the capacity to reason in the face of
uncertainty, and may include the use of models to predict degrees of
dependence and independence between data sets and other strategies in
order to effectively hypothesize and test premises for the purpose of
extracting relevant information content. Further advances in recognition
theory are needed, including methods for combining data and forming
inferences. Recent developments in neural network theory suggest that it
may be possible to create adaptive reasoning systems, but further
advances are required before such systems can be realized.
3. Information search.
Since information understanding will most likely work with a top-down
structure, methods are needed to organize hypotheses hierarchically, in
order to structure the search for content logically and efficiently. In
the same way that model-based systems generate hypotheses that are
verified and refined in a tree-search structure, analogs are needed to
organize the search for information content. Further, the search cannot
be hand-crafted for each recognition system application. Instead,
methods are needed to automatically generate the search trees and
hypothesis organization strategies.
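The hierarchical hypothesis search described above can be sketched as a depth-first refinement over a tree of hypotheses, descending into a branch only when the evidence supports it. The tree, the hypotheses, and the string-match "test" are all stand-in assumptions; a real system would generate both automatically.

```python
# Each hypothesis lists the more specific hypotheses that refine it.
tree = {
    "activity": ["construction", "flooding"],
    "construction": [],
    "flooding": [],
}

def supports(hypothesis, evidence):
    """Stand-in test: a hypothesis survives if any evidence item mentions it."""
    return any(hypothesis in e for e in evidence)

def search(root, evidence):
    """Depth-first refinement: descend only into supported hypotheses."""
    if not supports(root, evidence):
        return []
    confirmed = [root]
    for child in tree.get(root, []):
        confirmed.extend(search(child, evidence))
    return confirmed

evidence = ["unusual activity near reservoir", "flooding reported downstream"]
print(search("activity", evidence))  # ['activity', 'flooding']
```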
4. Information integrity.
Because data might be corrupted, faked, or inaccurate, not all
information sources should be trusted equally. While technologies exist
for authenticating information and securing its transfer, means of
assessing confidence in information sources, and the ability to discard
untrustworthy information, are topics that need further development.
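One simple way to combine trusted and untrusted sources, sketched under invented numbers, is a trust-weighted average that discards sources below a trust floor; the floor of 0.2 and the trust scores here are arbitrary assumptions.

```python
def fuse(reports):
    """Trust-weighted average; sources below a trust floor are discarded."""
    trusted = [(value, trust) for value, trust in reports if trust >= 0.2]
    total_trust = sum(trust for _, trust in trusted)
    return sum(value * trust for value, trust in trusted) / total_trust

# Three sources report a reservoir level; the third is nearly untrusted:
reports = [(10.0, 0.9), (11.0, 0.6), (50.0, 0.1)]
print(round(fuse(reports), 2))  # 10.4
```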
5. Information presentation.
Information presentation, as opposed to representation, is the manner in
which processed data is supplied to the human operator or commander.
This involves the human-machine interface as well as the specific manner
in which the data is displayed and its context established. Capabilities
for data visualization and multimedia presentation of information will
be important for the best performance of an information understanding
system that necessarily includes a human operator as an integral
subcomponent of the system.
6. Human-performance prediction.
An information understanding system that includes the human operator as
the final arbiter and decisionmaker can be effective only if the
human-machine interface is optimized with respect to human performance
in the context of the task at hand. Accordingly, it will be necessary to
acquire greater understanding of human cognition and decisionmaking
behavior.
Because information understanding is a cross-cutting endeavor, other
technology enablers in addition to those listed above will play a role
in its realization. For example, wireless networking technology,
including data transfer and connectivity standards, will be an important
factor.
This archive was generated by hypermail 2b29 : Sun May 21 2000 - 01:13:35 PDT