
Functionality Requirements for the

Successful use of 3D Environments


Key Learnings from Pilot Projects Utilizing
Immersive 3D Platforms and Technology for Applications
Involving Synchronous Training and Collaboration

May 20, 2010


Table of Contents
Abstract
Introduction
Sound, Volume and VoIP Control
    Individual Microphone Mute and Volume Control
    Open Microphones and Echo Cancellation
    Positional Sound
    Multi-Channel VoIP and Proximity Zones
    Dial-in Capabilities
Individual Granular Control of Attendees
    Content Control
    Mobility Control
    Speaking Control
    Appearance and Animation Control
    Presence Control
    Moderator Control
Access: Firewalls and Proxy Servers
    Behind the Firewall Installations
    Flash and Java-based 3D Applications
    Independent Firewall Friendly Solutions
Content Integration
    Screen Sharing
    Host Controlled Distributed Content
    Browser Content
Fidelity and Realism
    Avatar Face Fidelity
    Realism
    Focus and Eye Contact
    Self Expression and Non-Verbal Communication
Ease of Use
    Interacting with Objects
    Navigation
    Gesturing
    Set and Forget
    Viewing Content
Conclusion
About VenueGen
Appendix: Capability Checklist

Abstract
The purpose of this paper is to assist those who are using or considering the use of 3D virtual
environments. It is a guide to selecting, evaluating and prioritizing 3D technology,
functionality and virtual platforms. It is written to assist both technical and
nontechnical individuals who may be tasked with teaching, informing, collaborating, supporting
or persuading a geographically dispersed audience and who are considering the use of various
technologies and platforms to create immersive 3D virtual environments as their modality. It is a
practical guide based on the firsthand experiences of those who piloted various virtual 3D
technologies for these applications. These findings are relevant to almost all 3D applications for
business and learning, with the exception of custom-built simulators. It is formatted as a
summary of technical requirements, best practices, pitfalls and techniques that can be utilized to
create virtual experiences.

Questions this paper answers

1. What are the most common technical problems and obstacles reported by those utilizing
3D virtual environment technology today?
2. How are 3D platform vendors innovating today to address the issues reported by those
who pioneered the use of their technologies?
3. What technical capabilities are considered most essential for conducting consistently
successful virtual meetings?
4. Why is granular individual VoIP control a requirement for all immersive meetings?
5. How important is graphical fidelity and realism to creating immersive 3D environments?
6. What are the benefits and drawbacks of the various technical approaches vendors use
to integrate content into 3D virtual environments?
7. What are the most common constraints that firewalls and proxy servers place on the use
of 3D virtual technology and what are the various approaches platform vendors use to
address these challenges?
8. What are the most common interface design mistakes that can cause 3D environments
to be confusing and difficult to learn and use?
9. Which technological features are most important for driving adoption of 3D technology
within most organizations?

Introduction
3D environments can be incredibly engaging, interactive and cost effective when supported by
the right technology and functionality. For several years now, trainers, business professionals
and educators have been experimenting with 3D technologies and techniques. Their pilot
projects generally were targeted at one or more of the following goals:

o Achieve better distance learning outcomes
o Create more engaging, personal and cost-effective virtual meetings
o Increase productivity and creativity through online collaboration
o Extend the reach and accessibility of real-world meetings and events

Over the last three years, these initiatives reported mixed results. Early adopters of the 3D
modality endured many obstacles including immature software and limited technical functionality
to gain early firsthand experience with what many believed could be a major advancement in
how we meet and learn. These pioneers were excited about the potential of a more engaging
and personal online experience. They were emboldened by numerous studies and surveys that
indicated how almost all of the important metrics from participation to retention seemed to
dramatically improve when virtual attendees were immersed in an environment that felt and
functioned more like a real-world experience. Business professionals piloted 3D initiatives in
the hopes of moving their distance presentations beyond passive screen sharing to a more
natural and personal interaction with staff, partners and customers.

Interviewing those who piloted 3D virtual initiatives has been encouraging in that they, for the
most part, still believe in the promise of immersive distance meetings and training. Today's
generation of 3D technologies is delivering successful solutions and is beginning to address
the issues raised by these early adopters and to incorporate the specific functionality they found
to be critically important.

Following is a listing of some of the technical factors and capabilities cited as most important to
any successful 3D immersive experience.

Sound, Volume and VoIP Control


First, and many would argue foremost, sound has to be right in order to conduct successful
online meetings. You only need to attend one or two virtual events to realize how critically
important granular VoIP control is to the meeting's success. The CEO of Vivox Corporation put
it best: "If you don't have spot-on VoIP then you don't have anything when it comes to virtual
gathering." There are several technical requirements to making VoIP work well in virtual
environments.

Individual Microphone Mute and Volume Control

Most 3D platforms today still tend to put all virtual attendees into a single VoIP channel with one
master volume for the group as a whole. This approach is the reason most virtual meetings
start late, as each participant is asked to adjust his volume manually, one at a time, based on
feedback from the group. Often attendees are unable to figure out how to get their microphone
volume to an acceptable level because their headset is managed by volume controls found in
the operating system control panel, proprietary device driver software and a physical control on
the headset itself, any of which could be the problem. Add to this confusion the fact that there
are often multiple microphone control slider bars in Windows, with line-in and microphone boost
options hidden in various windows depending on which version of the operating system you are
using. It seems simple enough to ask someone to turn their microphone volume down but in
fact they may not be able to figure it out, and unfortunately one bad volume can ruin the entire
group's experience.

The answer is simple. Each attendee must be able to control every other attendee's volume
individually for themselves. In other words, if Bill's volume is too low I should be able to click on
Bill's avatar and increase his volume but only for me (i.e. this will not affect how others hear
Bill). With this ability each attendee simply makes a few adjustments on the fly as others are
speaking and the meeting just hums along. Without individual volume and mute control you are
asking for a frustrating experience.
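
One way to picture this requirement is a gain table keyed by listener and speaker, so that an
adjustment made by one attendee never changes what anyone else hears. The sketch below is
illustrative only; the class and method names are hypothetical and are not taken from any
particular platform.

```python
class ListenerMix:
    """Hypothetical per-listener gain table (not any vendor's API)."""

    def __init__(self, default_gain=1.0):
        self.default_gain = default_gain
        self._gains = {}  # (listener, speaker) -> gain multiplier

    def set_gain(self, listener, speaker, gain):
        # A negative gain is meaningless, so clamp at zero (silence).
        self._gains[(listener, speaker)] = max(0.0, gain)

    def mute(self, listener, speaker):
        self.set_gain(listener, speaker, 0.0)

    def gain(self, listener, speaker):
        # Speakers the listener has not adjusted play at the default level.
        return self._gains.get((listener, speaker), self.default_gain)


mix = ListenerMix()
mix.set_gain("me", "Bill", 1.5)  # Bill is too quiet, but only for me
mix.mute("me", "Carol")          # Carol is distracting, but only for me
```

Because the table is keyed by the (listener, speaker) pair, every other attendee continues to
hear Bill and Carol at whatever level they have chosen for themselves.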

Some vendors now offer auto-volume leveling technology which attempts to automatically adjust
each participant's volume to a similar level. This technology is based on quick initial samplings
and can help some but it is not a replacement for individual volume control. Most experienced
VoIP users will agree that it is rare for any auto-leveling technology to adjust every attendee's
volume perfectly. Individual and granular control of the volume at which you hear others is an
absolute must for painless virtual meetings.

Open Microphones and Echo Cancellation

A single attendee or student with an open microphone can ruin your virtual training class or
meeting. If a participant does not have a headset, the sound of others speaking comes through
his speakers and into his open microphone which then rebroadcasts the sound back to those
who are speaking. This is extremely frustrating as speakers hear everything they say echoed
back to them making it almost impossible to focus on what they are saying. It is often very
difficult to figure out which attendee in a larger group has the open microphone or even to
explain to them what the problem is when communications are so hampered.

Some vendors have attempted to address this issue by adding a "push to talk" feature that
forces virtual attendees to hold down a particular key while speaking, like using a CB radio.
One's microphone is muted automatically except when the talk key is being held down. This
does stop open-microphone echoes but it also adds a level of complication to the virtual
meeting as communication becomes less free flowing and natural. Attendees often forget to
hold down the required key when they talk, leading to long silences before apologies and
backtracking.

Not everyone has an available headset and even if they do they still may not be able to turn off
a built-in open microphone in their laptop, etc. The best solution to this problem is a
feature that lets the host force only the attendees with open microphones into
push-to-talk mode and leaves everyone else as is.

Echo cancellation is a sophisticated technology that is also targeted at this problem. It attempts
to automatically remove echoes from open microphones via sound pattern recognition. Echo
cancellation sometimes works well but other times it does not work at all. Echo cancellation and
granular host control of push to talk are both required capabilities for any online meeting of any
size especially if there will be attendees that are new to virtual meetings.

Positional Sound

Positional sound significantly enhances the virtual meeting experience and dramatically reduces
audio fatigue for attendees. Positional sound attempts to simulate how we hear sound in the
real world. It allows virtual attendees to hear others speaking predominately through their left or
right speaker based on the relative location of others to them in the virtual space. It also
reduces the volume of others proportionately to how far away they are from the listener.
Positional sound gives acoustical clues as to where others are virtually standing or sitting and
as such significantly increases the sense of presence and immersion in virtual environments.
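
The two effects described above, left/right balance and distance falloff, can be sketched in a
few lines. This is a minimal illustration under stated assumptions, not how any particular VoIP
engine works: positions are 2D floor coordinates, volume falls off linearly to zero at a
hypothetical `max_range`, and the balance uses a standard equal-power pan. The function name and
parameters are invented for the example.

```python
import math


def positional_gains(listener_xy, facing_rad, speaker_xy, max_range=20.0):
    """Return (left, right) speaker gains for one voice (illustrative sketch)."""
    dx = speaker_xy[0] - listener_xy[0]
    dy = speaker_xy[1] - listener_xy[1]
    distance = math.hypot(dx, dy)

    # Assumed linear falloff: full volume at distance 0, silent at max_range.
    volume = max(0.0, 1.0 - distance / max_range)

    # Bearing of the speaker relative to where the listener is facing.
    # A positive angle means the speaker is to the listener's left.
    angle = math.atan2(dy, dx) - facing_rad
    pan = -math.sin(angle)  # -1.0 = hard left, +1.0 = hard right

    # Equal-power pan keeps perceived loudness steady across the stereo field.
    theta = (pan + 1.0) * math.pi / 4.0
    return volume * math.cos(theta), volume * math.sin(theta)


# Listener at the origin facing "north" (+y); a colleague sits 5 units to the left.
left, right = positional_gains((0.0, 0.0), math.pi / 2, (-5.0, 0.0))
```

In this example the voice arrives almost entirely in the left speaker at reduced volume, which is
the acoustic clue that tells the listener where to look.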

This effect is hard to describe without experiencing it firsthand. For example, when someone at
the end of the table begins speaking, without even seeing that person, attendees automatically
know that the speaker is sitting to their left several seats away. When compared to everyone
being in a non-positional VoIP channel (i.e. a single volume without stereo), the difference is
striking. As will be discussed later, fidelity is an important part of what makes a
virtual environment work and positional sound is critically important to creating that sense of
presence. In the real world, we tend to look at other people when they speak to us. Without
directional sound clues, no one knows where to look so they either begin scanning the virtual
room trying to see whose mouth is moving or they simply stare off into space, not even
attempting to look at the speaker, which creates an equally weird effect.

Audio fatigue is a known and well-documented problem in the telecom and virtual meeting
industries. We are conditioned since birth to associate sound with a source. Presenters at the
2009 Basex Conference on Teleconferencing argued that when attendees cannot see who is
speaking, as much as 30% of their mental energy is spent trying to fill in the missing data,
asking "who said that?" and "who was that said to?" That is potentially 30% fewer mental cycles
available to focus on what is actually being presented. This effect is the reason why we tend to
feel more tired and mentally drained after a long telephone conference call than we do after a
real world meeting.

Positional sound is not just a VoIP gimmick; it frees up mental energy and focus which enables
attendees to learn and retain more while making better and faster decisions. It is an important
capability that will help your virtual attendees reduce audio fatigue. It will also enable them to
quickly look at others who are speaking to communicate their attention which dramatically
increases the fidelity and realism of the virtual experience.

Multi-Channel VoIP and Proximity Zones

A single VoIP channel works OK for smaller virtual gatherings but for larger classes, meetings
and events with dozens of attendees a multi-channel VoIP solution is required. Imagine you are
seated with a few of your colleagues at a virtual lecture or conference with fifty to a hundred
other attendees. You need to hear the presenter or panelists but you also would like to be able
to make comments to your colleagues sitting near you. If everyone in the room is in a single
VoIP channel then you will hear everyone in the room, thus making it very difficult to hear the
presenters. This is especially true in a positional VoIP channel where you hear those closest to
you the loudest. Single channel VoIP solutions try to avoid this problem by muting those who
are not presenters but this removes your ability to whisper to colleagues and it creates an
unnatural silence in the venue that dramatically reduces the fidelity of the experience.

A two-channel VoIP technology addresses these issues, allowing each attendee to be in both a
small directional proximity-based VoIP channel and an all-inclusive non-directional VoIP channel
simultaneously. The large inclusive channel is used by presenters like a house microphone.
Everyone can hear the house channel but only presenters and those granted permission can
speak into it. This has the effect of a real world house microphone or PA system. Anyone using
the house microphone VoIP channel can be heard clearly by everyone regardless of distance.

At the same time, attendees are also in a small proximity-based VoIP channel into which they
can speak and hear. Small proximity-based VoIP zones can encompass each attendee and a
few seats to either side. They allow colleagues to whisper to each other and share insights
without disturbing the entire room. If the proximity zones are sized properly they will also
provide just the right amount of ambient noise to create the sense of being in the audience at an
event of similar size and crowd. If an attendee is seated close to a chatty attendee (within that
person's proximity range), he can individually mute that person if the VoIP channel also supports
this feature.
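
The two-channel behavior described above can be reduced to a single audibility rule. The
following is a rough sketch only; the `Attendee` fields, the `can_hear` function and the default
radius are hypothetical, and it assumes (as one possible design) that individual mutes apply to
the proximity channel but not to the house channel.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Attendee:
    name: str
    seat: tuple                                # (x, y) position in the venue
    can_use_house: bool = False                # presenter, or granted permission
    muted: set = field(default_factory=set)    # names this attendee has muted


def can_hear(listener, speaker, on_house_channel, proximity_radius=3.0):
    """Decide whether `listener` hears `speaker` (illustrative rule only)."""
    if listener.name == speaker.name:
        return False
    if on_house_channel:
        # House channel: heard by everyone regardless of distance,
        # but only permitted speakers may talk into it.
        return speaker.can_use_house
    if speaker.name in listener.muted:
        return False
    # Proximity channel: heard only within the zone radius.
    # math.dist requires Python 3.8 or later.
    return math.dist(listener.seat, speaker.seat) <= proximity_radius
```

A real platform would also blend the two channels' volumes and apply positional sound inside
the proximity zone; this sketch only answers who is audible to whom.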

Getting directional, non-directional and proximity-based VoIP ranges right is as much of an art
as it is a science. The size, shape and configuration of the virtual venue must be taken into
consideration. For example, in a large lecture hall one would want small individual proximity
zones but in a smaller, more collaborative, venue such as a board room with facing chairs one
would want a single proximity zone that includes all attendees. In effect, the VoIP zones and
proximity sizes must be configured per venue and for the meeting experience that venue is
designed to facilitate.

Larger training rooms are the most difficult of all venues to configure correctly from a VoIP
perspective because of their dual functions. For example, at times a trainer may want to use
the classroom like a lecture hall with a very small proximity VoIP zone for each student. At
other times the teacher may want to play the role of a facilitator for an open class discussion
requiring everyone to be in the same proximity-based VoIP zone. Other configurations might
include larger proximity zones for teams meeting in the corners of the room. The best VoIP
system of all is one enabling the event host to dynamically adjust the size/range of the
attendees' proximity-based VoIP zone on the fly. Preconfigured settings/use-cases make for
the simplest user interface here, such as a menu option that allows hosts to configure their
virtual training room for lecture, open discussion, etc.
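
A preset menu of this kind can be as simple as a lookup from a named use case to a proximity
radius. The sketch below is hypothetical: the preset names, the radius values (in arbitrary
virtual-world units) and the `apply_preset` helper are invented for illustration and would need
tuning per venue.

```python
import math

# Hypothetical presets mapping a use case to a proximity-zone radius.
VENUE_PRESETS = {
    "lecture": 1.5,               # each student plus a seat or so to either side
    "team_breakout": 6.0,         # teams gathered in the corners of the room
    "open_discussion": math.inf,  # one zone covering the whole room
}


def apply_preset(room_settings, mode):
    """Return a copy of the room settings with the preset's radius applied."""
    if mode not in VENUE_PRESETS:
        raise ValueError(f"unknown venue mode: {mode!r}")
    updated = dict(room_settings)
    updated["proximity_radius"] = VENUE_PRESETS[mode]
    return updated
```

Returning a copy rather than mutating the settings lets a host switch modes mid-session and
switch back without losing the original configuration.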

Dial-in Capabilities

Dial-in refers to the ability to dial a telephone number that connects the caller to a VoIP channel
within a virtual environment. Most 3D platforms, if they support dial-in at all, think of it in terms
of an attendee who does not have access to a computer at the meeting time and is therefore
limited to audio-only participation. This is a common use case but a far more common use case
involves an attendee who does not have a headset or whose firewall settings do not allow VoIP
traffic. These participants can be and should be virtually present. They have an avatar, see
content and can text chat with others. They just need a sound solution that incorporates their
virtual presence.

The best dial-in solution is one that turns the attendee's telephone receiver into a VoIP headset
for all practical purposes. Although directional sound is not possible with a single-ear telephone
receiver, the participant's avatar's lips should move as the user speaks into the phone. This
capability alone dramatically increases the number of attendees who can participate in a virtual
training event because it makes the event available to those who do not have a headset or
cannot connect to a VoIP channel.

Individual Granular Control of Attendees


In general, individual granular control refers to an instructor's or meeting host's ability to control
each attendee's experience and permissions individually and not just as a group. This
capability is very important to achieving both the level of control and audience participation that
makes for the best interactive virtual experiences. Individual granular control falls into several
categories.

Content Control

Meeting hosts/instructors must be able to pass content control individually to an attendee. The
goal of 3D immersion is increased engagement. Allowing others to take control of content
allows them to be more than passive observers. It enables them to demonstrate their learnings
and participate interactively in the process. The ability for an instructor or facilitator to remove
this permission is equally important to maintain control.

Mobility Control

Meeting hosts/instructors must be able to individually grant and revoke the ability to stand
and move about. This is important especially in larger virtual venues and public events where
one disruptive attendee can compromise everyone's experience. The ability to come to the front
of the class or lecture hall or to move to a breakout session is critically important to maintain
the feeling of active participation and engagement.

Speaking Control

A meeting host's or instructor's ability to individually grant and revoke an attendee's right to speak
publicly is equally important for both creating an interactive experience and maintaining control.
Attendees should be able to mute other attendees if they become a distraction. Meeting hosts
should have the ability to grant speaking privileges by VoIP channel to any attendee.

Appearance and Animation Control

The host/instructor must have the ability to control any factor that might compromise the
learning process or event. For example, meeting attendees or students should not be allowed
to dress or undress in a way that would be offensive or a distraction to others. Likewise,
students should not have unchecked abilities to perform gestures or animations that would also
prove counterproductive or disruptive. The best balance is to allow individual freedom of
expression while empowering the instructor or presenter with the ability to restrict any attendee
who crosses the line.

Presence Control

Meeting hosts and instructors must also have the ability to limit access to and even expel a
disruptive participant. Corporate trainers often control access to sensitive and proprietary
information that must be kept secure. Public events particularly require the presenter to have
an ability to control access and to expel invalid or distracting attendees. This is one lesson from
public virtual worlds that cannot be ignored. The event host and those granted moderator rights
must have total control of their virtual gathering.

Moderator Control

Finally, instructors and meeting hosts must have the ability to give another virtual attendee or
instructor the ability to moderate. Moderator control is the ability to grant and to revoke all of the
abilities discussed thus far to another attendee.
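
The categories of control in this section map naturally onto a set of per-attendee permission
flags, with moderator rights being the permission to change other people's flags. The sketch
below is one hypothetical way to model that; the `Permission` names and the `Meeting` class are
invented for illustration and do not describe any specific product.

```python
from enum import Flag, auto


class Permission(Flag):
    NONE = 0
    CONTENT = auto()     # may take control of shared content
    MOBILITY = auto()    # may stand and move about
    SPEAK = auto()       # may speak publicly
    APPEARANCE = auto()  # may change dress and use gestures
    MODERATE = auto()    # may grant and revoke the rights above


class Meeting:
    """Hypothetical per-attendee permission table."""

    def __init__(self, host):
        self.host = host
        self.rights = {}  # attendee name -> Permission flags

    def _may_moderate(self, actor):
        held = self.rights.get(actor, Permission.NONE)
        return actor == self.host or Permission.MODERATE in held

    def grant(self, actor, attendee, perm):
        if not self._may_moderate(actor):
            raise PermissionError(f"{actor} is not a moderator")
        self.rights[attendee] = self.rights.get(attendee, Permission.NONE) | perm

    def revoke(self, actor, attendee, perm):
        if not self._may_moderate(actor):
            raise PermissionError(f"{actor} is not a moderator")
        self.rights[attendee] = self.rights.get(attendee, Permission.NONE) & ~perm

    def allowed(self, attendee, perm):
        # The host implicitly holds every permission.
        return attendee == self.host or perm in self.rights.get(attendee, Permission.NONE)

    def expel(self, actor, attendee):
        # Presence control: removing an attendee drops all of their rights.
        if not self._may_moderate(actor):
            raise PermissionError(f"{actor} is not a moderator")
        self.rights.pop(attendee, None)
```

Because each right is tracked per attendee, a host can, for example, let one student take
control of the slides while leaving that student's mobility and everyone else's rights untouched.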

Access: Firewalls and Proxy Servers


Most surveys and reports being published today continue to find access as the single biggest
problem facing the expanded use of 3D virtual technology especially for corporate training
applications. 3D applications are particularly offensive to firewalls because they tend to require
diverse types of internet traffic including VoIP, positional data and all types of content. No
matter how sophisticated the virtual offering, if your intended audience cannot participate then
your initiative will fail.

Corporate security administrators will open firewall ports for approved products that they
purchase after a detailed evaluation but they rarely will do this for the one-off person who wants
to attend a virtual class or event. The bottom line is that your chosen platform must run
seamlessly through the vast majority of corporate firewalls and proxy servers.

Behind the Firewall Installations

3D vendors tend to tackle the firewall problem in one of three ways. The first way is to require
that their platform be installed behind their corporate customer's firewall. Historically this has
created a lot of difficulties for virtual learning pioneers. Besides adding upfront expense, risk,
time and administrative complexity to the pilot project, this approach also only resolves the
firewall issue for employees that are in fact behind that particular firewall. When they try to use
the platform to include customers, business partners or anyone who does not work for their
company, they discover that they have not solved their firewall problem but rather they have
simply moved it.

Flash and Java-based 3D Applications

A second way that 3D vendors attempt to work around firewalls is to develop a 3D Flash or
Java-based application. Adobe Flash is a popular browser plug-in that allows movies and
simple interactive programs to run within a browser. Flash is very popular and as such it is
approved for download and use by most corporations. If a vendor can make their 3D
application look like Flash content then they can avoid many firewall obstacles.

The problem with this approach is that Flash content and Java applets, which run inside a
browser plug-in, are very limited graphically in what they can do compared to standalone software
programs. Java and 3D Flash applications do tend to run through corporate firewalls but they
have rather limited functionality and offer very poor fidelity. This presents a problem because,
as will be discussed later, the fidelity of the virtual experience is important to how others use
and view this technology within their organization. In the tradition of Jeff Foxworthy's popular
"You might be a redneck" monologue:

If your avatar looks like a cartoon;
If you are unable to turn your neck to look around;
If your virtual venue resembles SpongeBob's living room;
Then you might be a Java or 3D Flash application!

Independent Firewall Friendly Solutions

The third and by far the most difficult way 3D vendors work around firewalls is to engineer every
aspect of their platform to run seamlessly through every type of firewall and proxy server
configuration. Very few vendors attempt this approach because getting this right, and
addressing the related security issues, etc., can be as time consuming and difficult as building
their 3D platform originally.

This is by far the best approach because it provides for a platform that is accessible from
anywhere, can be used for any application or target audience and provides for a full-featured
high fidelity experience. Such offerings can also be offered as a service running as a browser
plug-in. These applications require no firewall changes or complicated and expensive onsite
installation of servers. As such, they tend to require no upfront cash outlay or hardware and can
be purchased as a monthly subscription. Such pilots have far less risk and tend to be approved
faster.

Firewalls and proxy servers are no longer confined to large corporations. They are starting to
appear in small companies and even as part of home cable and DSL provider networks.
Firewall technology is also constantly evolving. Make sure that your chosen 3D platform not
only works well through existing firewalls and proxy servers, but also that your vendor is
committed to maintaining that functionality on an ongoing basis into the future.

Content Integration
A major component of most virtual training and meetings is the ability to share and collaborate
around various types of content. There are two ways to achieve this in a 3D virtual
environment.

Screen Sharing

The ability to broadcast one's screen buffer to others has been around for many years but the
ability to share that image on a viewer inside of a 3D environment creates a much more
immersive experience. Screen sharing is a common way to allow others to view your content
without actually distributing that content to their local computers. Screen sharing, often called
web conferencing, is the only way to allow others to see any real time edits you are making to a
document or other content.

There are, however, some major drawbacks to real time screen sharing if it is a platform's only
method for sharing content. Broadcasting many frames-per-second from your screen buffer is
very bandwidth intensive and as such generally displays poorly except on the most static of
content. Screen sharing is woefully inadequate for sharing videos for example and even
PowerPoint documents with embedded slide transitions and animations can appear jerky and
delayed. Worst of all, real time 3D applications using VoIP are typically already maximizing
available bandwidth. When real time screen sharing is added to the mix VoIP often becomes
scratchy and overall 3D performance can become slow and out of sync.
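
A quick back-of-the-envelope calculation shows why broadcasting a screen buffer is so demanding.
The figures below are assumptions chosen for illustration (a 1024 x 768 screen, 3 bytes per
pixel, 10 frames per second), and the result is the uncompressed rate only. Real screen-sharing
products compress heavily and skip unchanged regions, so actual usage is far lower, but the raw
number indicates how much work that compression must do and why fast-changing content suffers.

```python
def raw_screen_bitrate_mbps(width, height, bytes_per_pixel, frames_per_second):
    """Uncompressed bit rate of a screen broadcast, in megabits per second."""
    bits_per_frame = width * height * bytes_per_pixel * 8
    return bits_per_frame * frames_per_second / 1_000_000


# Assumed example: 1024 x 768 pixels, 24-bit color, 10 frames per second.
rate = raw_screen_bitrate_mbps(1024, 768, 3, 10)  # roughly 188.7 Mbit/s before compression
```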

Host Controlled Distributed Content

Another, more complex approach for sharing content within a 3D environment involves
distributing the actual content to all participants so that it can run locally on the user's
computer while being controlled by the instructor or meeting host. For example, a PowerPoint
slide show or video can be distributed to attendees but controlled by the meeting host. When the
host decides to play or pause the video or to advance to the next slide, a small command is
broadcast to all attendees, who then see the content change as if they were looking at the
instructor's copy of that content.
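
The host-controlled approach described above can be illustrated with a minimal sketch. It assumes each attendee already holds a local copy of a slide deck and that the host sends only a short JSON command. The class names, the command vocabulary ("next", "previous", "goto") and the in-process delivery loop are hypothetical stand-ins for whatever a real platform uses.

```python
import json
from dataclasses import dataclass, field


@dataclass
class LocalSlideShow:
    """An attendee's local copy of a slide deck (hypothetical stand-in)."""
    slide_count: int
    current: int = 0

    def apply(self, message: str) -> None:
        # Decode the small command sent by the host and update local state.
        command = json.loads(message)
        if command["action"] == "next":
            self.current = min(self.current + 1, self.slide_count - 1)
        elif command["action"] == "previous":
            self.current = max(self.current - 1, 0)
        elif command["action"] == "goto":
            self.current = max(0, min(command["slide"], self.slide_count - 1))


@dataclass
class Host:
    """The presenter; holds the attendee copies that must stay in step."""
    attendees: list = field(default_factory=list)

    def broadcast(self, action: str, **params) -> int:
        # The only data sent in real time is this short message,
        # not the slides or video frames themselves.
        message = json.dumps({"action": action, **params})
        for attendee in self.attendees:
            attendee.apply(message)
        return len(message.encode("utf-8"))


attendees = [LocalSlideShow(slide_count=20) for _ in range(3)]
host = Host(attendees)
bytes_sent = host.broadcast("next")
```

In this sketch each command is a few dozen bytes regardless of how rich the slide is, which is the property that leaves bandwidth free for VoIP.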

There are many advantages to the distributed document approach. Content, especially content
with moving graphics, displays smoothly just as if it were running locally because, in fact, it is
running locally. Another advantage of this approach is that it requires far less real time
bandwidth than screen sharing. More available bandwidth means better performance in general
within the 3D environment especially where VoIP is concerned.

There can be a major limitation to the distributed document approach if the 3D platform requires
that each attendee have the actual software viewer installed locally, e.g., PowerPoint, Word,
the QuickTime player, etc. Sophisticated platforms avoid this requirement by converting all
content to a common and ubiquitous format such as Flash prior to distribution. This removes
the requirement to have each content player or application installed locally.

Another limitation of the distributed document approach is that the distributed content must be
downloaded before it can be viewed. If large video files, for example, are added to a meeting or
training session, they cannot be viewed by attendees until they have been completely
downloaded. Well-architected platforms minimize this delay by allowing presenters to select
content for a virtual class or meeting when that meeting is created. Anyone who registers to
attend that event will automatically have the content downloaded prior to the event's start time.
This approach keeps almost all bandwidth available for use once the meeting has begun.

A final concern regarding distributed content is security. Ask to see your vendor's security
whitepaper. All content should be converted and encrypted before distribution, with temporary
unlocking keys sent only when that content is being used by the presenter or instructor. The
distributed and encrypted content should also be promptly removed from local hard drives as
soon as the virtual event ends.

Ideally, your 3D platform should support both distributed content and real time screen sharing.
Distributed content provides better performance and a superior viewing experience while screen
sharing is the only way to edit content in real time once a meeting has begun.

One last point to keep in mind is that the way content is reproduced within a 3D
environment varies widely from vendor to vendor, with varying degrees of fidelity. For example,
most platform vendors support PowerPoint content, but they do so by converting individual
PowerPoint document slides into static JPEG images. This approach loses all PowerPoint
slide transitions, animations, embedded video, etc. Platforms that have the ability to run Flash
content natively within their 3D engines will have the most faithful content conversions,
producing in-world documents that look and respond exactly as they do when launched from
your desktop.

Browser Content

Browsing the Internet and viewing web-based applications is a requirement for most 3D virtual
training and meeting platforms. To avoid screen sharing, almost all 3D platform vendors who
support shared browsing use Linden Lab's open source version of the Mozilla browser. It
allows an instructor or presenter to control each attendee's local Firefox browser in such a way
that everyone appears to be viewing the host's browser within the 3D environment.

Unfortunately, there are many well-documented problems that have plagued the Linden Lab
browser implementation on Windows-based PCs. Because the Mozilla browser is a third-party
application, it is not tightly integrated with the Windows operating system the way that Internet
Explorer (IE browser) is. For example, it does not share the Windows caching system or
information about what browser plug-ins are installed. This often creates unexpected results,
causing some attendees to see one web page while others see something totally different.

Platforms that have taken the time to integrate the operating system's native browser into their
3D engine will have much better compatibility and provide a more consistent viewing experience
for virtual attendees. An added benefit of integrating native browsers is that they ship with the
operating system, so they do not have to be included in the 3D platform vendor's installation
process, which reduces the average download size by dozens of megabytes.

Fidelity and Realism


There has been much debate about how important fidelity and realism are to the adoption and
use of immersive 3D applications. Fidelity refers to how lifelike and natural the avatars and
venues appear, i.e., how faithfully they represent the real-world experience they are simulating.

Some argue correctly that one can experience content and directional sound equally well with a
low-polygon cartoonish avatar. Likewise, one does not need shadows and high quality textures
to participate effectively in a class lecture. Some studies have even shown that children, for
example, often relate better to less realistic stylized virtual environments. Most 3D vendors,
especially those offering low fidelity applications, will point to older concepts such as the
Uncanny Valley, which stated that most people would prefer to have a cartoon-like
representation of themselves rather than something that looks close to human but strange. In
other words, between photorealism and a cartoon lies the uncanny valley where 3D gets weird
and creeps people out.

Avatar Face Fidelity

Our ability to create photo-realistic avatars and environments has come of age and the most
practical argument for 3D fidelity involves avatar face creation. If a trainer, professor or
executive can create his avatar with good fidelity from an uploaded photograph then everyone
who knows him will instantly recognize him at virtual events. This is valuable because it
empowers virtual staff meetings to get off to a quick start as attendees do not have to introduce
their unrecognizable proxies. There are advantages when virtual attendees can tell immediately
who is waving at them and professors, at a glance, can know who is asking the question without
having to endure clouds of jumbled floating name banners obscuring attendees and content.

The demand for and use of 3D photorealism today is unprecedented. We routinely watch movies
that flow seamlessly between real and computer-generated characters with most of us none
the wiser. We are told that the Uncanny Valley lies between photorealism and 3D art. If a
person's photograph can be converted to a texture and accurately mapped onto his avatar, then
one could argue effectively that we have at last crossed over the valley to the side of avatar
photorealism.

Business professionals generally want their avatar to be a fairly accurate representation of
themselves so that they can be recognized immediately at virtual events. We talk about the
importance of getting "face time," i.e., the feeling of a personal experience with others. One CEO
was recently quoted as saying, "My face is my brand." The president of a large university said
he could see tremendous applications for the immersive internet, from distance learning and
faculty office hours to student recruiting and alumni meetings. "However," he continued,
"there's no way I'm going to address my staff or students looking like some cartoon!" The same
sort of response has been echoed by many business leaders.

Many believe that 3D fidelity is important, if for no other reason, because it is linked to the
adoption of 3D technology in general. For years some management teams laughed off 3D pilot
projects even after they demonstrated excellent metrics and solid ROI, for no other reason than
that they could not get past the fact that this technology's typical cartoonish characters remind
them of children's games. Fidelity is important because it positions 3D meetings as a serious
tool that can provide faithful representations of the venues and people who teach, learn and
work within them.

The Proteus Effect is a well-documented principle of psychology. It basically states that the way
we act is influenced by how we feel about the way we look. In other words, humans have a
tendency to "play the part" based on how we feel about our appearance each day. It has often
been stated that people only buy into and support a virtual environment to the extent that they
buy into and relate to their representation of themselves within that environment. It is interesting
to watch the subtle changes in how focus group attendees act after they see their photo-created
avatar for the first time. They often stop referring to the avatar as "it" and begin to refer to it as
"me." These subtleties are the essence of what creates the sense of presence and self that makes
virtual reality immersive.

If for no other reason, fidelity is important because it can have a very positive impact on the
success of your pilot project and the future adoption of your 3D workplace applications.

Realism

Realism is an important part of fidelity that addresses how lifelike the 3D experience appears.
For example, do avatars move in a natural and convincing way? Do they make eye contact
when they look at others? Do their lips move appropriately while speaking? Can they
communicate using expression?

Realism is important for many of the same reasons that fidelity is important. Would you want to
use virtual conferencing and meeting tools if the fidelity was poor? For example, would you use
video conferencing if the video images on your screen were washed out and blurred? What if
video attendees did not look like themselves and appeared to jerk around in unnatural-looking
ways? Probably not; it is important that 3D fidelity be of high enough quality to basically not be
all that noticeable or at least not be a distraction. The attendee's mind should have minimal
obstacles in accepting the virtual venue and other attendees as reasonable likenesses of what
is being simulated. If others appear as flat cartoons incapable of neck movement or unable
to walk without ice skating, then the business professional or student is constantly being
reminded that the immersion is not real. Realism is important because it helps the virtual world
to get out of the way so that attendees can focus on the goals of their gathering.

The key to an immersive movie viewing experience is the suspension of disbelief. We have all
watched low budget movies with poor and unconvincing graphics and sets. No matter how
great the dialogue or story line, you probably struggled to get into the movie. Many 3D
pioneers report a similar experience when using low fidelity platforms.

It can take additional years of technical and artistic polishing, even after a 3D platform has been
launched, to create consistently smooth and natural-looking avatar movement. Sophisticated
3D platform providers, in business for the long haul, know that this time-consuming and
expensive effort is critical to their product's success and the long-term adoption of the 3D
modality.

Focus and Eye Contact

A fundamental question that teachers and presenters ask themselves many times in both real
and virtual meetings is "Do I have their attention?" A huge advantage that realistic high-fidelity
platforms afford is the ability for users to communicate focus and attention. Users can move their
focus, causing their avatar to adjust head, neck, eyes and posture appropriately so others can see
exactly where they are looking. When an avatar turns and looks directly at you, then you know
you have that person's attention. This fidelity dramatically increases the overall sense of
presence and makes virtual events much more personal and engaging.

Self Expression and Non-Verbal Communication

Another important part of 3D fidelity and realism involves a virtual attendee's ability to express
himself. If you watch participants in a real-world meeting or classroom you will immediately
notice that they do not just sit in passive trances. On the contrary, they constantly communicate
even when not speaking. They laugh, frown, fidget, slump, grimace and do a host of other
things to communicate their engagement in and feelings about what is happening or being said.
The ability for virtual attendees to use non-verbal communication is an important part of what
keeps them engaged and actively participating. In lectures, conferences and larger gatherings
we may not all be afforded an opportunity to state our opinions verbally, but we can all still
communicate, and this ability keeps us from becoming passive, disengaged bystanders.

Although some platform vendors dismiss non-verbal communication as simply being cute or a
gimmick, 3D pioneers have discovered that it is these degrees of control that keep virtual
attendees from slipping into the all-too-common disengaged state associated with passive 2D
web conferencing.

Ease of Use
Over two consecutive years at one virtual learning conference the audience was asked to vote
on what it believed to be the single biggest obstacle to the adoption and use of immersive 3D
environments within their organization. Both years the number one response was the same: 3D
environments are too difficult to learn and use. The issues cited included installation,
navigation, gesturing and content integration. No matter how full featured and sophisticated
your 3D platform, if it's difficult to learn and use you will have a major uphill battle reaping the
benefits it has to offer.

Rather than show specific 3D interface blunders (and they do abound), this section will focus on
some broad guidelines and generally accepted best practices for interfacing and controlling
one's experience in a virtual environment. Most of these best practices were derived from
surveys and from watching hundreds of hours of users who were new to 3D immersion
attempting to use various interfaces without any instructions or previous training.

Interacting with Objects

A common mistake most platform designers make is in employing 2D interface design
techniques rather than taking advantage of the 3D environment itself. Finding an icon on a tool
bar that opens a menu of options may be your only choice when using a 2D product, but 3D
environments can provide for much more intuitive interactions. The best 3D interfaces are
those that allow users to mouse click directly on the object, person or content with which they
want to interact. This can cause a default action or a small menu of options. For example,
clicking on a chair might pop up a menu with the option to "sit here." Clicking on an in-venue
view screen might offer options to add, advance, enlarge or close content, etc.

Rather than hunting through icons and menus, this interface approach limits what new users
need to remember to one thing: "To interact with or control something, just click on it." Interface
consistency is critical for ease of use, so to interact with someone in the 3D environment users
should likewise simply be able to click directly on another avatar. This should produce a list of
actions that can be performed directly on the other person, such as point at, change volume
level, view profile, or grant the ability to control content, etc. Interfaces that take advantage of
the 3D environment itself are the most intuitive, natural and easy to learn.

Actions that are the most common should always be the most accessible in the interface. For
example, the ability to point at something or someone is a very common and useful
communication tool. After watching and documenting the pointing references from hundreds of
hours of real world video of teachers, presenters and collaborators, it was discovered that nearly
25% of all non-idle movement involves reference pointing. Presenters predominately point at
their audience, themselves, and their content. By making the ability to point a function of
clicking on the object being pointed at, this high-runner interaction becomes intuitive, quick to
activate and easy for users to remember.

Navigation

The first problem that new users in 3D environments tend to face is navigation: "How do I get my
avatar where I need to be or sit down at that table?" Non-gamers have a hard time walking
around using arrow keys and tend to collide with things. They often lose orientation by looking
straight up at the sky or ceiling. This initial experience can be very frustrating and embarrassing,
and it can even sour some new users on immersive 3D environments altogether. Many
issues can be minimized by having attendees appear already in their seats. If ambulatory,
users should not have to use arrow keys to walk around. They should simply be able to click on
the ground where they want to go and have their avatar automatically navigate there, avoiding
obstacles and other avatars, i.e., click on what you want to interact with, even the floor. Field of
view should also be constrained to help prevent users from looking straight up and becoming
disoriented.
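
As a rough illustration of click-to-walk navigation, the sketch below finds an obstacle-avoiding route across a room modeled as a small grid. The grid representation, the `find_path` name and the use of breadth-first search are assumptions made for the example; the paper does not say how any particular platform plans avatar routes.

```python
from collections import deque


def find_path(grid, start, goal):
    """Breadth-first search on a grid; '#' cells are obstacles.

    Returns the list of (row, col) cells from start to goal, or None
    if the clicked spot cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the chain of predecessors back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and nxt not in came_from:
                came_from[nxt] = cell
                queue.append(nxt)
    return None


# A hypothetical room: '.' is open floor, '#' is a table.
room = [
    "....",
    ".##.",
    "....",
]
path = find_path(room, start=(0, 0), goal=(2, 3))
```

The user supplies only the destination cell (the click); every intermediate step around the table is worked out for them.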

Another common navigation problem can be easily avoided by limiting the size and complexity
of the virtual environment itself. Military simulators and games may need acres of land and
many buildings, but the typical training class or boardroom meeting can be easily
accommodated in a single room. Modeling your entire campus or headquarters might seem
very cool, but in reality you are probably just adding a lot of complexity and expense to your
project and creating additional navigation problems for your virtual attendees. The vast majority
of conference rooms, classrooms, etc. are fairly generic and can be covered using a relatively
small and cost-effective group of pre-built venues.

Likewise, flying, teleporting and using portals within a virtual world are also very cool but not the
kind of thing people want to figure out when late for a meeting or class. The simplest and
most practical way to get to a meeting or class is to have it listed on a 2D webpage or as a
hyperlink in an email invitation. Once clicked, the virtual attendee should find himself seated in
the event and ready to go with no need for additional navigation or training.

Gesturing

As previously discussed, the ability to gesture and express oneself is an important part of
creating an engaging virtual experience. All 3D platforms support basic gesturing such as hand
raising, clapping and agree/disagree. These platforms, however, support a very limited number
of other gesture animations. This is mainly because it is too difficult for presenters to look
through a list of hundreds of potential gestures to find the right one on-the-fly while talking and
controlling content at the same time. Unfortunately, limiting the number of available gestures is
not a good solution either. A limited number of gestures that everyone in the room repeats over
and over again can dramatically reduce realism and does not provide the repertoire presenters
really need. This is an interface dilemma that has perplexed 3D platform providers for years.

A novel solution to this problem has recently emerged. The study of hundreds of cross-cultural
gestures led to the concept of gesture archetypes (VenueGen White Paper: Gesture Archetypes).
There are a limited number of gesture classes (types) that communicate the same core
meaning. The palms-up gesture archetype, for example, always communicates a lack of or need
for something. Any gesture from this class communicates the same meaning.

Interfaces that use gesture archetypes are much simpler to use and significantly increase the
number and variety of available gestures. When a user selects an archetype his avatar will
automatically and randomly display one of dozens of appropriate gestures from that meaning
class. In other words, the presenter does not have to select a particular gesture from a long list
but rather he focuses on what he wants to communicate and the avatar gestures appropriately
from that gesture group.
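
The archetype idea can be sketched in a few lines. The table below is purely illustrative: only the palms-up archetype is named in this paper, and the individual gesture names and the `play_archetype` function are hypothetical.

```python
import random

# Hypothetical archetype table: each key is a meaning class, each value
# lists animations that all convey that same core meaning.
ARCHETYPES = {
    "palms_up": ["shrug", "open_hands", "one_palm_raise"],
    "agree": ["nod", "thumbs_up", "slow_nod"],
}


def play_archetype(name, rng=random):
    """Pick one gesture at random from the chosen meaning class."""
    return rng.choice(ARCHETYPES[name])


# The presenter chooses the meaning; the avatar chooses the animation.
gesture = play_archetype("palms_up", random.Random(0))
```

The presenter's menu needs one entry per archetype, while the number of distinct animations can grow without making the interface any longer.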

Interfaces that allow multiple related uses from a single button or icon can also significantly
simplify an otherwise complex 3D interface. For example, a gesture icon can be single-clicked
to play a normal gesture archetype, double-clicked to play a high profile or more intense version
of that gesture, or clicked-and-held to continue doing the gesture until released. This approach
not only feels very natural and intuitive, it also provides support for hundreds of gestures through
relatively few onscreen icons.
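
One way to read the single-click, double-click and click-and-hold scheme is as a small classifier over mouse timestamps. The sketch below is an assumption-laden illustration: the function name and both timing thresholds are invented for the example and would need tuning in a real interface.

```python
def classify_click(presses, hold_threshold=0.5, double_window=0.3):
    """Map raw (press_time, release_time) pairs for one icon to an action.

    presses: list of (down, up) timestamps in seconds, oldest first.
    Returns "hold", "double" or "single".
    """
    first_down, first_up = presses[0]
    if first_up - first_down >= hold_threshold:
        return "hold"      # keep looping the gesture until released
    if len(presses) > 1 and presses[1][0] - first_up <= double_window:
        return "double"    # play the more intense version
    return "single"        # play the normal version
```

For example, `classify_click([(0.0, 0.1)])` is treated as a single click, while a press held for 0.8 seconds is treated as a hold.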

Set and Forget

Another best practice for all interface design, but especially important in 3D virtual
environments, is the concept of set-and-forget, or autopilot. In less than a minute, users should be
able to create a profile of how their avatar speaks, gestures, sits, and acts so they don't have to
click icons and buttons during meetings. Your avatar should look and act like you automatically.
This interface technique allows users to select options or general behavior patterns for their
avatar as a default so as not to have to manually drive changes. An example of a set-and-forget
behavior might include selecting how much one typically moves his hands while speaking. Once
selected, one's avatar randomly moves its hands automatically (to the extent selected) whenever
the user is speaking.

Posture is another excellent candidate for a set-and-forget interface. One should be able to
manually instruct his avatar to change its idle sitting animation from legs crossed to hands-in-lap
but does one really want to use mental energy to keep up with this? Selecting a default posture
level between formal and casual is much easier. The avatar then randomly changes postures
automatically from time to time but remains true to the formality class selected.
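
A set-and-forget profile of this kind might look something like the following sketch. The posture names, the two formality classes and the `AvatarProfile` fields are hypothetical; the point is only that the user sets two values once and the avatar then varies its behavior on its own within those limits.

```python
import random
from dataclasses import dataclass

# Hypothetical posture sets; a real platform would define its own.
POSTURES = {
    "formal": ["upright", "hands_in_lap", "hands_on_table"],
    "casual": ["legs_crossed", "leaning_back", "arm_on_chair"],
}


@dataclass
class AvatarProfile:
    formality: str = "formal"   # chosen once by the user
    hand_movement: float = 0.5  # 0.0 = still, 1.0 = very animated

    def next_posture(self, rng=random):
        # Vary the idle posture but stay within the chosen class.
        return rng.choice(POSTURES[self.formality])

    def moves_hands(self, speaking, rng=random):
        # Hands move only while speaking, as often as the profile allows.
        return speaking and rng.random() < self.hand_movement


profile = AvatarProfile(formality="casual", hand_movement=0.0)
posture = profile.next_posture(random.Random(1))
```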

Set-and-forget interfaces can make a tremendous contribution to a 3D platform's realism and
ease of use. It is amazing to see one's avatar automatically acting like its owner. The less new
users have to do to appear natural and at ease in a virtual environment, the easier it will be to
drive adoption of 3D technology in your enterprise.

Viewing Content

A final interface issue often cited involves how users view content. The ability to clearly read
content from anywhere in a virtual environment is critical for most types of virtual meetings and
training events. If a user is required to move his avatar close to content in order to read it, then
there will be navigation and crowding issues. If the content pops up in another window, forcing
the 3D window to minimize or shrink proportionately, then a large part of the sense of presence
and immersion is lost.

The best interface for viewing content involves two capabilities. First, users should be able to
zoom their focus in on content without actually having to move their avatar closer to it. This can
be accomplished via a mouse wheel, track pad, or the plus and minus keys. This feels very
natural, allowing users to control their perspective. They can zoom out to see speakers or zoom
in to read fine content or even choose to split the difference.
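
Zooming focus without moving the avatar can be modeled as narrowing the camera's field of view. The sketch below assumes that model; the function name, the step size and the angle limits are illustrative values, not figures from any platform.

```python
def zoom_fov(current_fov, wheel_steps, step=5.0, min_fov=10.0, max_fov=60.0):
    """Return a new camera field of view (degrees) after a zoom input.

    Positive wheel_steps zoom in (narrower view). Only the camera angle
    changes; the avatar's position in the room is never touched.
    """
    new_fov = current_fov - wheel_steps * step
    # Clamp so the user can neither zoom through the content nor
    # widen the view beyond the default.
    return max(min_fov, min(max_fov, new_fov))
```

Because the result is clamped, repeated scrolling settles at the closest reading view or the widest room view rather than overshooting.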

A second important capability for viewing very small content is an in-venue floating window.
This technique forces the selected content to enlarge and float in reading position within the 3D
environment. This also feels natural simulating how a user might hold up a piece of paper while
reading it. When used, this function should also trigger an animation that shows the avatar
looking at a paper or handheld device to communicate to others that the user is concentrating
on the content.

These combined techniques enable simple and natural viewing of content while carefully
preserving immersion within the 3D environment.

Conclusion
Much has been learned in the last few years about what is required to successfully pilot the use
of 3D virtual meeting and learning technology. Buyers should carefully review vendors to
evaluate how well they have incorporated these learnings into their platforms. Most 3D
platforms today still lack many of the features and capabilities necessary to host high fidelity, full
featured, problem-free virtual events. Early adopters of virtual technology, however, remain
optimistic about platform vendors' ability and commitment to meeting their needs.

The 3D immersive Internet is being adopted today for meetings, training, and conferences.
Virtual platforms can now be delivered as a service and have become accessible to smaller
businesses, providing a powerful competitive advantage, once only available to larger
corporations with huge budgets.

The age of 3D adoption is upon us. Virtual platforms are smarter, and users now have more
video power and bandwidth than ever before. Innovative businesses can improve information
delivery to a dispersed workforce, getting everyone in the same room instantly. Opportunity
exists today for businesses to use maturing 3D technology to transform processes, increase
productivity, extend reach, save time and reduce costs.

About VenueGen
VenueGen is a browser-based 3D immersive internet meeting platform that enables professionals to
meet, train, collaborate, share and present information quickly and cost effectively via virtual venues such
as boardrooms, classrooms and social halls. VenueGen customers simply select a meeting room, upload
any type of content, and instantly enter a high fidelity virtual room with directional VoIP. VenueGen
enables users to start realistic and immersive virtual meetings that are more personal and engaging than
typical web conferencing and more practical and scalable than video-based solutions. With VenueGen,
attendees communicate more fluently, make decisions and learn faster, and are more productive than
with other online virtual meeting technologies. No more boring conference calls, complex and expensive
video equipment or time consuming travel. VenueGen is Business Ready. Based in Research Triangle
Park, NC, VenueGen offers a 30-day free trial. If you have three minutes and an internet browser, you
have all you need to see the future of virtual meeting technology.

venuegen.com 919.228.4997 info@venuegen.com

Appendix: Capability Checklist
Following is a checklist of platform capabilities discussed in this paper and what many
experienced 3D pioneers believe to be must-have functionality for any successful pilot. This
checklist may be helpful in evaluating your current 3D platform or while in discussions with
vendors as part of your platform selection process.

Platform Requirement/Functionality Check List Availability

Sound, Volume and VoIP Control (pages 4-7)


1. Independent microphone mute and volume control for each attendee

2. Microphone volume auto-leveling

3. Echo control: ability to force push-to-talk (closed microphone)

4. Automatic echo cancellation

5. Directional/positional VoIP sound

6. Simultaneous multi-channel VoIP and configurable proximity zones

7. Dial-in (turning phone receiver into VoIP headset)

Individual Granular Control of Attendees (Pages 8-9)


1. Ability to individually grant and revoke content control

2. Ability to individually grant and revoke mobility

3. Ability to individually grant and revoke proximity zone speaking

4. Ability to individually give and retake house microphone

5. Ability to control attendee appearance and animations

6. Ability to control event access and expel attendees

7. Ability to individually grant and revoke moderator rights

Access: Firewalls and Proxy Servers (pages 9-10)


1. Ability to run as a service without behind-fire-wall installation

2. Ability to run as a browser plug-in

3. Independent firewall friendly engine (not a Java/Flash-based application)

4. Proxy server support


Content Integration (pages 11-13)
1. Support for real time screen sharing

2. Ability for host to distribute and control content running locally

3. Ability to run Flash content and applications natively

4. Ability to convert content to Flash for 100% 3D compatibility

5. Integrate IE browser for full browser/OS compatibility

Fidelity and Realism (pages 13-16)


1. High fidelity photo-realistic (4000 polygon plus) graphics

2. Ability to create avatar face from uploaded photograph

3. Realism (suspension of disbelief, natural head and shoulder turning)

4. Ability to convey focus and attention (eye contact)

5. Facial expression

6. Self expression (posture, interest level, etc)

Ease of Use (pages 16-19)

1. Interact directly by clicking on objects, viewers and others

2. Simplified and minimized navigation requirements

3. Integrated gesture archetype interface

4. Set and forget automatic movements based on user profiles

5. Zoom focus capability to read fine content

6. Click to enlarge content to reading position within 3D environment
