Ramesh Jain
Electrical and Computer Engineering, and
College of Computing
Georgia Institute of Technology,
Atlanta, GA 30332-0250
jain@ece.gatech.edu
Abstract
Data has changed significantly over the last few decades. Computing systems that
initially dealt with data and computation rapidly moved to information and
communication. The next step on the evolutionary scale is insight and experience.
Applications are demanding the use of live, spatio-temporal, heterogeneous data. Data
engineering must keep pace by designing experiential environments that let users apply
their senses to observe data and information about an event and to interact with aspects of
the event that are of particular interest. I call this out-of-the-box data engineering because
it means we must think beyond many of our timeworn perspectives and technical
approaches.
1.0 Introduction
Data engineering has evolved and continues to evolve faster than most people ever
imagined. While computing in the early 1970s dealt only with alphanumeric data,
technology now furnishes its users with an unprecedented volume and variety of
data—from encyclopedia pages to clips of the latest music, from a spreadsheet to a real-
time recording of a triple bypass. And access methods and requirements are evolving at
the same pace.
To keep pace with the demand for live, spatio-temporal, heterogeneous data, data
engineering must let go of old paradigms, which have outlived their usefulness. It is time
to think out of the box—to consider what the operating environments of these new
systems should look like. How can we build something that is experiential, not
information-centric?
Equally interesting is that user expectations of the data system have changed more
rapidly than the data itself. To keep up with these changes, we must consider what the
operating environments of future systems would be and how to realize those
environments rather than how to accommodate new functionality in our existing
paradigm.
In this paper, I look at the changing nature of applications by considering a few novel
applications that use large volumes of data and then discuss the functionality expected
from these systems. That computing systems have evolved to follow user demand and
application development is an important insight in this discussion: applications initially
focused on data and computation, then information and communication, and now insight
and experience. Most techniques in data engineering were developed to meet the needs of
data systems of the last quarter of the 20th century. Data engineers must now address the
needs of this century.
2.2 EventWeb
Web search engines are notorious for their lack of discrimination, and XML has not solved the problem because ultimately no search engine can anticipate a user’s
exact needs. The semantic Web is receiving a lot of press and people are pinning many
expectations on personal agents to help find the right information and services.
I’m not convinced that this will solve the problem. The semantic Web still follows
the legacy of Gutenberg. It is a web of “pages” that are predominantly prepared in a
document mode. Again, this is fine if all you want are descriptions. But we can do so
much more. Visualize instead a web of events, in which each node represents an event,
past or present. Each event is not just someone’s description of the event or some
statistics related to it. It is the event, brought to you by one or more cameras,
microphones, infrared sensors, or any other technology that lets you experience the event.
For each event, all the information from sensors, documents, and other sources is united
and presented to the user independently of the source. The user then experiences the
preferred parts of a particular event in the preferred medium.
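As a rough illustration of this vision, the sketch below models an event node that unites heterogeneous sources (camera, microphone, sensor, document) and links to related events, so a user can pick the preferred medium. It is a minimal sketch under assumed names (EventNode, MediaSource) that are not part of any existing system.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

# Hypothetical sketch of an EventWeb node: the event itself, not a description
# of it, with every source attached so a user can experience it as preferred.
@dataclass
class MediaSource:
    kind: str          # e.g. "video", "audio", "infrared", "document"
    uri: str           # where the stream or file lives
    description: str = ""

@dataclass
class EventNode:
    name: str
    start: datetime
    end: datetime
    location: str
    sources: List[MediaSource] = field(default_factory=list)
    related: List["EventNode"] = field(default_factory=list)  # links forming the web

    def sources_of(self, kind: str) -> List[MediaSource]:
        """Return only the sources in the medium the user prefers."""
        return [s for s in self.sources if s.kind == kind]

# Example: a school football game is an event like any CNN news event.
game = EventNode(
    name="First football game",
    start=datetime(2002, 9, 14, 10, 0),
    end=datetime(2002, 9, 14, 12, 0),
    location="Local elementary school",
    sources=[MediaSource("video", "rtsp://example.org/game-cam-1"),
             MediaSource("audio", "rtsp://example.org/game-mic-1")],
)
print([s.uri for s in game.sources_of("video")])
```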
In this vision, events are treated equally. The archived video of a news event is
accessible in the same way as the CEO’s Web cast or your son’s first football game. The
source can be anything from CNN to the local elementary school—whatever or whoever
generates events worth archiving. And perhaps most important, because it is not text
centered, the event Web will reach the 90% of humanity who either cannot grasp or
cannot access current information and communication technology.
I see the rudiments of this vision already. Stores are stocking Web cams in every
shape and size at prices that even students can afford. Sensors that were once discrete are
now being connected to form networks for various Internet applications, from a sushi bar
in San Francisco to an ant colony in Lansing, Michigan. Multimedia phones with built-in cameras will be next. In short, we are witnessing the beginnings of the event Web explosion, just as we earlier saw the document Web explosion.
The system must assimilate data from all sensors to provide a unified model of the situation. Users should not see the situation as raw data from different sensors, but as the evolving big picture. Thus, in an emergency, users see not just isolated sensor streams from different locations, but a situation characterized as needing medical help, fire engines, or police. In all these applications, users are interested in the real-world situation, not in the data from a specific source. Sensor data is but one of several sources that form the model of the situation.
Data sources are broadly of two types, precise and imprecise, and user requirements for
the data fall into roughly two categories, insight and information. The matrix in Figure 1
captures the tensions between these four characteristics. In many situations, I know the
data source precisely, even though it may be distributed. In other cases, I know only that
what I need is available from somewhere. Likewise, in some situations, I am trying to
gain insights into the behavior of a system, event, or concept, so my primary need is to
explore and understand. In other situations, I need information and I want a specific
answer.
Predictably, databases are in the intersection of precise and information, the
bottom left quadrant. Nothing beats them as a means of getting information from a
precise source. In the top left quadrant are visualization environments and tools,
promising ways to gain insight from a precise source. In the bottom right quadrant are
search engines. Few people will dispute that search engines deal with imprecise sources. However, their intention is to provide information, not to support further exploration. Exploration does occur, but usually to find a suitable match for the query, not to explore
an event further. Finally, in the top right quadrant is the intersection of insight and
imprecise source. This intersection produces what I call the experiential environment, a
new way of presenting data that will become increasingly common in most data-intensive
applications. This will then improve techniques in the other three quadrants.
Figure 1. Data sources versus user needs. Visualization environments occupy the precise/insight quadrant, experiential environments the imprecise/insight quadrant, current databases the precise/information quadrant, and search engines the imprecise/information quadrant.
Computers can perform sequential and logical operations millions of times faster than any person, but their perceptual capabilities—even after all the progress of the last 40 years—remain relatively primitive. Yet current databases present sequential and logical information to humans and expect computers to detect complex patterns. The powerful synergy of human and machine is short-circuited. If we use computers and humans synergistically, we can develop the experiential environment.
The spinning hourglass, the progress bar that takes an agonizing amount of time to fill, or the endlessly flitting pages are often the only indications that the system hasn’t completely abandoned its task. Some Web sites try to reveal the number of bytes left, which is marginally useful as long as traffic allows. Nothing, however, will induce users to explore if it takes too long to move from place to place. When latency is low, on the other hand, exploration is much more pleasurable. Video games are an example; their appeal is due in part to their near-zero latency.
In images and video, users are interested in objects. How to get at those objects is a problem that most data engineers are ignoring. The same is true for signals in medical, seismic, and other applications. Signals are usually indexed using global or semi-global features, while semantics usually requires the structure of local features.
Unfortunately, these techniques are still immature, primarily because researchers
are interested in developing general-purpose techniques rather than restricting their
system to a specific context. Researchers can learn from the success of natural language
or speech recognition systems—all successful systems work in a specific context.
Figure 2. Different data sources have different indexing mechanisms, but these sources
live in their own silos.
The challenge to the database community, then, is to break down these silos to unify
information. This requires more out-of-the-box thinking because most data sources are
designed to behave like independent silos. Their creators assume that after the integration
system analyzes the silos and extracts their metadata, it will somehow combine the
metadata to provide correct results. Indeed, many current research efforts are aimed at
this kind of solution.
Researchers also form strong silos. I know from experience in many research areas,
including image and video database research, that tunnel vision is common. Just as the
six blind men had vastly different ideas about the size, shape, and function of an elephant,
so the database, computer vision, and information retrieval communities have diverse
(and equally stubborn) views of an image database. Having all these people develop
systems without communicating is no more productive than having five students in
separate rooms attempt to produce a coherent thesis.
Perhaps the challenge is to break down both kinds of silos.
Figure 3. Event graphs unify different data sources by providing a semantic indexing and
linking approach.
Users can select one or two event classes or navigate through class ontology hierarchies; there is no theoretical limit on the subclass structure. The depth of the hierarchy depends on the model used in the application and the data available. Selecting a class automatically selects all its subclasses. To navigate through event location and time, users either zoom or move in different directions, similar to the way video game players select parts of a map, from a room to an entire world. The event timeline can span anything from microseconds to centuries.
At all points of the search the user experiences What-You-See-Is-What-You-Get
(WYSIWYG) characteristics. Once a user selects event classes, part of the map, and time,
the system presents all events and their selected attributes at all three places. In the figure,
the user has selected the inventory class for SBU accessories worldwide in 2001, which is akin to the text query, “Show me the inventory status of all the SBU accessories worldwide in 2001.” The event list (bottom half of the screen) shows the details of the inventories. The colored dots on the map show the location and status of the inventory: needs immediate attention, needs some attention, or okay. To avoid confusion, this example does not show a color-coded list and timeline, but if the user selects an item in the list, the
display will change color to highlight that selection and its corresponding symbols in the
time and location areas. The exact mix of color and symbols depends on the application.
By displaying events on a map as well as on a timeline, the WYSIWYG approach
maintains event context. The user can then refine the search through any window, say by
zooming into and out of the map or timeline or going left or right. A change in one part
automatically updates results in the other windows. Consequently, the query and presentation spaces remain one and the same. Also, as users change the search criteria, they get immediate results with minimal latency. In most applications, the results can be instantaneous. Users can experiment with the data set on their own terms and develop insights at their own pace, always within the event’s context. The system displays results
continuously, making it easier to hypothesize about data relationships. It will be possible
to test a hypothesis by linking such a system to data-mining tools that would let the user
explore large data warehouses.
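The sketch below illustrates how such a what/when/where selection might be evaluated against an event base: selecting a class pulls in all of its subclasses, and the same filter drives the list, map, and timeline, so zooming or narrowing simply re-runs the query with a smaller bounding box or time range. The event tuple layout, the ontology dictionary, and the function names are illustrative assumptions, not an actual EventViewer API.

```python
from datetime import datetime
from typing import Dict, List, Set, Tuple

# Hypothetical event record: (event class, (lat, lon), timestamp, attributes).
Event = Tuple[str, Tuple[float, float], datetime, dict]

# Illustrative class ontology: selecting a class also selects its subclasses.
ONTOLOGY: Dict[str, List[str]] = {
    "inventory": ["sbu_accessories", "sbu_parts"],
    "sbu_accessories": [],
    "sbu_parts": [],
}

def with_subclasses(selected: Set[str]) -> Set[str]:
    """Expand the selected classes with all of their descendants."""
    expanded = set(selected)
    frontier = list(selected)
    while frontier:
        for child in ONTOLOGY.get(frontier.pop(), []):
            if child not in expanded:
                expanded.add(child)
                frontier.append(child)
    return expanded

def query(events: List[Event], classes: Set[str],
          bbox: Tuple[float, float, float, float],
          start: datetime, end: datetime) -> List[Event]:
    """What/when/where filter; bbox is (lat_min, lat_max, lon_min, lon_max)."""
    wanted = with_subclasses(classes)
    lat_min, lat_max, lon_min, lon_max = bbox
    return [e for e in events
            if e[0] in wanted
            and lat_min <= e[1][0] <= lat_max and lon_min <= e[1][1] <= lon_max
            and start <= e[2] <= end]
```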
If a user wants to know more about the event, he can explore it by double clicking in
any of the three windows (what, when, or where). The system then provides all the data
sources (audio, video, or text) and any other event characteristics packaged in an event
envelope, which the system generates automatically from the information assimilated in the event base as the event base is created. The user can launch a variety of sources
from the envelope, and they will open in the desired mode in either a different window
than the user originally selected or in the same window. The system accesses and
appropriately presents much of the information in the event envelope through links to
original sources, such as programs launched to present results of a particular dataset or a
simulation.
An event envelope is a powerful mechanism that unifies the results of many complex
operations. If selected variables have dynamic attributes, the event envelope can present
historical attributes for those variables. Users can then save an event envelope as a snapshot—the particular state of an event—and compare it with other snapshots representing later states. The snapshot button (top right in Figure 3) lists all event envelopes the user has saved. An event envelope can be sent through e-mail and hence can help build communities around specific themes. For example, amateur astronomers are interested in scanning the sky for near-Earth objects such as comets. Clubs could exchange event envelopes and
commentary about the images in the same e-mails. Moreover, the envelopes would
contain links to details like magnitude and angular distance from a known star. This kind
of rich communication increases both individual and community knowledge.
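A minimal sketch of an event envelope with snapshots appears below; the class name, fields, and diff operation are assumptions made for illustration rather than the system’s actual design.

```python
import copy
from datetime import datetime
from typing import Dict, List

# Hypothetical event envelope: a package of sources and attributes for one
# event, which can be frozen as snapshots and compared or e-mailed later.
class EventEnvelope:
    def __init__(self, event_id: str, sources: Dict[str, str],
                 attributes: Dict[str, object]):
        self.event_id = event_id
        self.sources = sources        # e.g. {"video": "rtsp://...", "notes": "file://..."}
        self.attributes = attributes  # dynamic attributes, e.g. magnitude, position
        self.snapshots: List[Dict[str, object]] = []

    def snapshot(self) -> Dict[str, object]:
        """Freeze the current state of the dynamic attributes."""
        snap = {"taken_at": datetime.utcnow(),
                "attributes": copy.deepcopy(self.attributes)}
        self.snapshots.append(snap)
        return snap

    def diff(self, earlier: Dict[str, object],
             later: Dict[str, object]) -> Dict[str, tuple]:
        """Compare two snapshots attribute by attribute."""
        keys = set(earlier["attributes"]) | set(later["attributes"])
        return {k: (earlier["attributes"].get(k), later["attributes"].get(k))
                for k in keys
                if earlier["attributes"].get(k) != later["attributes"].get(k)}
```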
7.0 Applications
Here we briefly describe a few applications to give an idea of what could be done in this environment.
7.1 Football
Figure 5: Event graph of a football game, showing quarters, team drives, and downs, with outcomes such as touchdowns (TD), field goals (FG), and turnovers (TO).

The graph in Figure 5 shows the events in a football game. The text shows several levels of event hierarchies, from the complete game to a particular drive. For simplicity, the figure does not show levels below a drive. The graph also shows potential transitions from one event to the other in the game in terms of downs. Thus, the graph represents a subset of an event-transition diagram for the game.
The model is generic for the game; the sequence of events generated depends on the particular game. Figure 4 shows only a small subset of the event model to give a flavor of the application.
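To make the event hierarchy and event-transition diagram concrete, here is a minimal sketch that models game, quarter, drive, and play events with parent links and an illustrative transition table over downs and outcomes; the class name and the specific transitions are assumptions, not the system’s actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical event hierarchy for a football game: game -> quarter -> drive -> play.
@dataclass
class GameEvent:
    kind: str                        # "game", "quarter", "drive", "play"
    label: str                       # e.g. "Quarter 1", "Team A drive", "1st down"
    parent: Optional["GameEvent"] = None
    children: List["GameEvent"] = field(default_factory=list)

    def add(self, kind: str, label: str) -> "GameEvent":
        """Attach a child event one level down the hierarchy."""
        child = GameEvent(kind, label, parent=self)
        self.children.append(child)
        return child

# Illustrative subset of an event-transition diagram in terms of downs and outcomes.
TRANSITIONS = {
    "1st down": {"1st down", "2nd down", "TD", "FG", "TO"},
    "2nd down": {"1st down", "3rd down", "TD", "FG", "TO"},
    "3rd down": {"1st down", "4th down", "TD", "FG", "TO"},
    "4th down": {"1st down", "TD", "FG", "TO"},
}

game = GameEvent("game", "Team A vs. Team B")
q1 = game.add("quarter", "Quarter 1")
drive = q1.add("drive", "Team A drive")
drive.add("play", "1st down")
```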
Our data sources for the game included video (plus audio) from multiple cameras;
play-by-play information, which various companies generated and made available as a
data stream; and a player and statistics database. A rule-based system decided if a
particular play (an event) would be of interest to anyone and thus whether or not to save
the related video.
The system parsed the play-by-play data stream, applied the rule base, and prepared an
event base for the game. As Figure 6 shows, the event base appeared to the user as a
“time machine,” in that users could go to any moment and see all the related statistics,
including the score, timeouts left, rushing yards, and first downs for each team at that
particular time.
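A rough sketch of this pipeline, under an assumed play-by-play record format and an illustrative rule base (the paper does not specify either), might look like the following.

```python
from typing import Dict, Iterable, List

# Hypothetical play-by-play record, e.g.
# {"quarter": 1, "clock": "12:34", "type": "pass", "yards": 42, "scoring": True, "team": "A"}
Play = Dict[str, object]

def interesting(play: Play) -> bool:
    """Rule base deciding whether a play is worth keeping video for (illustrative rules)."""
    return bool(play.get("scoring")) \
        or int(play.get("yards", 0)) >= 20 \
        or play.get("type") in {"interception", "fumble"}

def build_eventbase(feed: Iterable[Play]) -> List[Play]:
    """Parse the play-by-play stream and keep only plays flagged as events."""
    eventbase = []
    for play in feed:
        if interesting(play):
            play = dict(play, save_video=True)  # mark the related video for archiving
            eventbase.append(play)
    return eventbase
```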
This display is like the one we discussed above, but it is designed using domain information for football and hence is familiar to football fans. By clicking on the timeline, a user can select any time instant and see the state of the game at that time. By moving the pointer along the timeline, users can see how the game evolved. They could filter events of their choice and see them in a standard football representation—the football field at the bottom of the screen. By double-clicking on a play, they could get more information about the play or watch a video of it through the event viewer. As they watched scoring plays from various angles, they could click on a player to get more information.
This system was used for twenty-five college football teams. Fans who could not watch a game on national TV, either because the game was not televised or because they were in the wrong place, could enjoy their team’s games in a compelling way using this system. They could watch the same play by their favorite player from multiple angles to gain insight into what really happened.
Figure 6: Event graph for the sales and inventory application. The “what” hierarchy covers Sales (overall, by customer, by product category, and by product), Inventory, and Marketing. Activity classes such as YearlySalesActivity and MonthlySalesActivity carry attributes such as target and actual sales amounts and sales calls and their discrepancies, with quarterly and daily sales amounts and sales calls rolled up beneath them.
Figure 6 shows an event graph for a sales forecast and inventory monitoring system designed to monitor an automotive parts manufacturer’s key activities. These include sales (monthly, daily, and hourly forecast targets and actuals for different sales regions) and inventory (monthly, daily, and hourly available inventory for different warehouses). Activities are rolled up temporally (hourly to daily to monthly) and by various “actors” (customers, parts, parts lines, and so on). Figure 7 shows a screen shot of EventViewer for this application. Performance indicators for each activity are mapped to red, yellow, and green based on domain-specific criteria. The display in Figure 7 is a close-up version of the display in Figure 4 shown earlier. One can select different geographic regions and different parts to understand what is going on in that part of the world. These displays provide a holistic picture of the situation to an analyst, who can then drill deeper into it. The system supports such drill-down tools, but we will not discuss them here due to space limitations.
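The sketch below illustrates the kind of temporal rollup and red/yellow/green mapping described above; the record layout, thresholds, and function names are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple

# Hypothetical hourly sales record: (timestamp, region, target, actual).
Record = Tuple[datetime, str, float, float]

def roll_up(records: Iterable[Record],
            period: str) -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Roll hourly figures up to daily or monthly totals per region."""
    fmt = {"daily": "%Y-%m-%d", "monthly": "%Y-%m"}[period]
    totals: Dict[Tuple[str, str], Tuple[float, float]] = defaultdict(lambda: (0.0, 0.0))
    for ts, region, target, actual in records:
        key = (region, ts.strftime(fmt))
        t, a = totals[key]
        totals[key] = (t + target, a + actual)
    return totals

def indicator(target: float, actual: float) -> str:
    """Map a target/actual discrepancy to red, yellow, or green (illustrative thresholds)."""
    if target == 0:
        return "green"
    ratio = actual / target
    if ratio >= 0.95:
        return "green"
    if ratio >= 0.80:
        return "yellow"
    return "red"
```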
Figure 7: Another display of the inventory application. Compare this display to the one in Figure 4 to see how the system can be used in WYSIWYG mode.
Conclusion
Rapid advances in many related areas have brought interesting challenges to the data engineering community. Traditional database techniques need to be reconsidered and readapted for the new applications. Relational approaches are powerful and will still be useful as a back end. But the front end of these systems requires data engineering that is very different from what we have done so far. The challenge is to take a more solution-oriented perspective or be boxed into back-end repository management.
Some new attributes of data emerge as dominant issues: semantics, multimedia, liveness, location sensitivity, and separate streams of sensor and other data. To unify all
sources of information, events appear to offer a powerful approach for modeling,
managing and presenting data. I believe that event-based experiential environments will
be useful in many emerging applications. The thoughts and ideas I have presented are
still in the conceptual stage. We have a long way to go in refining this approach to make
it practical, but it is clear we must take a new path, one that is outside conventional
thinking, if we are to keep pace with and enable these new applications.