For those of you who don't follow the news, or just happen to live under rocks, Microsoft
Research released a beta SDK for the Xbox 360 Kinect. If you don't know what a Kinect is,
then I will assume you do indeed live under a rock. The Xbox 360 peripheral has wowed
gamers since 2010, and now Microsoft has seen fit to release a potential SDK for the device.
In this article, I intend to demonstrate my first crack at the API. This article is targeted at
anyone interested in developing applications which make use of the Kinect. Novice coders
should have no trouble following what I did (since, in the Kinect world, I am a novice
myself!).
The requirements of this project are as follows:
Visual Studio 2010* (any edition should work, even Express)
.NET 4.0 (this should be installed with VS 2010, if not already installed)
The following Microsoft Speech-related libraries are needed for speech recognition. Make sure you get
the x86 version of each library. This is because the Kinect SDK is built in x86 mode.
- Speech Platform Runtime (v10.2) x86
- Speech Platform SDK (v10.2)
- Kinect English Language Pack (direct download)
* - I tried to get this to work with VS 2008, but VS had trouble recognizing the added
reference to the Kinect DLL. It may work with VS 2008, but as of this writing I did not
figure out how to do so. (But hey, VS 2010 Express is free. Why not upgrade?) =)
** - The beta SDK was designed around the Xbox version of the Kinect, and at the
time this article was written there was no Windows Kinect. MS has since released a
Windows-specific version of the Kinect, and a corresponding SDK for that device. The
Windows Kinect SDK is incompatible with the Xbox Kinect. MS Research did not
release (to my knowledge) an updated version of the Xbox Kinect SDK, and the beta
SDK discussed in this article is the only choice available to you if you only have an Xbox
Kinect.
www.experts-exchange.com/Programming/Languages/.NET/A_6259-Getting-Your-Application-Kinect-ed.html 1/14
04/05/13 Getting Your Application Kinect-ed: c#, .NET, VB.NET, Kinect, XBox
There are a couple of things you should know about the SDK. As I mentioned previously, it
is in beta, so don't be surprised if there are bugs! The next thing is that the license
governing the SDK provides for non-commercial use. I'm not going to cover the license in
depth here, but if you use the SDK to create your own projects, make sure you read the
license thoroughly and understand what you are agreeing to. I am in no way legally
inclined and cannot offer advice as to acceptable use of the SDK.
My goals in this project were simple: become familiar with the API. Many of the samples
that come with the SDK are written to take advantage of WPF. I haven't had much
experience with that technology (yet), and so I was compelled to create a Forms application
that could utilize this API. I played around with the Skeletal Tracking capabilities and I also
dabbled in Speech Recognition. Let's first examine Skeletal Tracking, found under the
Microsoft.Research.Kinect.Nui namespace.
I must confess: this blew my mind once I got it working, which didn't take long after going
by the sample project. I kept my project rather simple--rather than draw the traditional
skeleton, as demonstrated in the SDK's sample project, I drew only dots to represent the
joints. We'll call it a poor-man's motion-capture studio. In order to play with Skeletal
Tracking, you'll need to understand some of the classes which fall under this feature's
namespace.
Checking for the InvalidOperationException is a very good idea, as it will tell you if the API was
unable to find/communicate with your Kinect. The Runtime class exposes a few events that
pertain to the arrival of each kind of visual data the device can gather. For this project, the
event of importance is SkeletonFrameReady. Adding a handler to this event will give us the
opportunity to interact with the Joints calculated by the device (more on these later). This
was all that I needed to get started with tracking my movements via the Kinect. Now for
picturing myself!
SkeletonFrame
Cameras, even video cameras, capture images as frames, which, when equating to
something tangible, could be thought of like a Polaroid (if you're old enough to remember
what those are). In the case of the Kinect, a frame is a single "image" as captured by one of
the sensors. The skeletal tracking system has its own notion of frames as well. A frame in
this case is the instantaneous position of the collection of Joints that make up the skeleton.
Don't get too caught up on the notion of "single," though, as even though we capture one
"image" at a time, a skeletal image may actually contain two skeletons! Why? Well the
Kinect was designed for multiplayer capabilities (as in simultaneous users, not just Internet-
ready), and so it has the capability of capturing two simultaneous skeletons via its sensor.
For my project, I only focused on one skeleton (I just couldn't bring myself to share!).
SkeletonData
As I mentioned, SkeletonData stores the points recognized by the sensor. This class also
houses a few other useful members, such as Position, TrackingState, and UserIndex. For this
project, I focused on TrackingState and Joints, as demonstrated in the SDK sample. These
two members allowed me to project myself onto my form. The first member, TrackingState,
refers to whether or not the Joint is being tracked. I honestly can't describe what this
implies, as I would think all Joints would be tracked. The documentation is a bit thin on this.
The second member, Joints, represents the collection of all points detected by the sensor.
What points are detectable? There are 20 points, in fact. They consist of:
Left ankle
Right ankle
Left elbow
Right elbow
Left foot
Right foot
Left hand
Right hand
Head
Left hip
Right hip
Hip center
Left knee
Right knee
Left shoulder
Right shoulder
Shoulder center
Spine
Left wrist
Right wrist
There is an enumeration named JointID that has an entry for each joint mentioned above. A
twenty-first entry, JointID.Count, does not correspond to a joint; rather its value represents
the number of joints defined by JointID, which according to the documentation is useful for
looping through the Joints collection (or presumably other collections).
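The "Count as the last entry" idiom can be shown without the SDK installed. The sketch below uses a stand-in enum, DemoJointID, with only three joints; that enum and the JointLoop class are hypothetical names made up for illustration and are not part of the Kinect API.

```csharp
using System;
using System.Collections.Generic;

// Stand-in enum for illustration only. The real JointID is much longer,
// but it ends with a Count entry in the same way.
public enum DemoJointID
{
    Head,
    LeftHand,
    RightHand,
    Count   // not a joint: equals the number of entries declared before it
}

public static class JointLoop
{
    // Visits every real entry by iterating up to, but not including, Count.
    public static List<string> Names()
    {
        var names = new List<string>();
        for (int i = 0; i < (int)DemoJointID.Count; i++)
        {
            names.Add(((DemoJointID)i).ToString());
        }
        return names;
    }
}
```

Because Count is declared last, adding a new entry above it keeps the loop bound correct without touching the loop itself.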
So, given all these data structures, how the heck do you make that darn black alien hot dog
do something cool? Let's see, shall we?
Working inside of the SkeletonFrameReady handler, we can loop through the skeletons
detected in the frame, and for each skeleton, we can translate the Joint point to a screen
point. I kept a class-level queue of Point structures for later painting. I used a queue
because it is much easier to work with than an array is. Here is what the translating looks
like:
Here's what's going on above. I loop through the skeletons in the frame (line 5). For each
skeleton, I check that the SkeletonData is in a state of being tracked (line 7). If it is, I proceed
to convert each Joint to a Point (lines 9 - 19). The SkeletonEngine class provides a couple of
useful methods. One of the methods, DepthImageToSkeleton is used to return a value
between 0 and 1 (I assume), and that value can be calculated against the client area of a
form or canvas for later painting. Notice in lines 15 - 17, I call DepthImageToSkeleton and
then multiply each of its out parameters against its respective dimension with regard to the
ClientRectangle of the form. This essentially translates the image from "camera space" to
"application space." The math for converting these values was extracted from the sample
project.
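The scaling step described above can be sketched without calling the SDK at all. The helper below is hypothetical (ScreenMapper.ToClientPoint is a name invented here, not an SDK method), and it assumes the two values handed back by the SkeletonEngine fall between 0 and 1, which is the same assumption made in the text. The clamping is an extra precaution of this sketch.

```csharp
using System;

// Hypothetical helper, not part of the Kinect SDK: maps a normalized
// coordinate pair (each expected in the range 0..1) onto a client area.
public static class ScreenMapper
{
    public static void ToClientPoint(float normX, float normY,
                                     int clientWidth, int clientHeight,
                                     out int x, out int y)
    {
        // Clamp first so a joint that drifts outside the expected range
        // lands on the edge of the form instead of off-screen.
        float cx = Math.Max(0f, Math.Min(1f, normX));
        float cy = Math.Max(0f, Math.Min(1f, normY));

        // Scale each axis by the matching dimension of the client area.
        x = (int)(cx * clientWidth);
        y = (int)(cy * clientHeight);
    }
}
```

In a form, the width and height arguments would come from ClientRectangle.Width and ClientRectangle.Height, and the resulting pair would be stored for the later paint pass.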
You will notice that there is no actual drawing here. I am merely translating the points and
storing them in the queue. The drawing occurs when I override the Paint method of the
form. What, you don't believe me? See for yourself:
Here I just loop through the Point queue and draw the points as 11-pixel-diameter circles. To
make sure this code is called appropriately, notice the call to Invalidate as the last thing the
SkeletonFrameReady handler does. The combination of these groups of logic is what brings
the app to life:
It's Alive!
You probably noticed a bit of code in the SkeletonFrameReady handler that I didn't discuss
earlier. Well I didn't want to just settle for being a "dancing queen," so I decided to
implement the ability to click a button with my hand. I'll warn you now, it's not as
extravagant as I would like (I'd rather actually push to indicate a button press rather than
simply hovering over it). I believe I would need to incorporate depth tracking to make the
demo "pop" more, but for now, I'll hover.
In lines 21 - 38, I implement logic to check if the point representing my right hand has
stayed in generally the same spot while a timer ticks down. I put a threshold of 15 in either
direction as my algorithm for detecting "hovering." If I breach the threshold, then I stop my
timer. If I'm within my threshold, and my timer is not running, then I start it. For this
project, my timer's interval is 3 seconds. I also set the position of the cursor to follow the
point representing my right hand.
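For readers who want to experiment with the hover idea away from the device, here is a minimal sketch of dwell detection. It is not the code from my handler: HoverDetector is a hypothetical class, it takes the current time as a parameter instead of using a Forms Timer, and on a breach it restarts timing from the new position rather than simply stopping. The threshold of 15 and the 3-second dwell match the values described above.

```csharp
using System;

// Hypothetical dwell-to-click helper; names and structure are illustrative.
public class HoverDetector
{
    private readonly int threshold;   // allowed drift in pixels on each axis
    private readonly TimeSpan dwell;  // how long the hand must stay put
    private bool timing;
    private int anchorX, anchorY;
    private DateTime started;

    public HoverDetector(int threshold, TimeSpan dwell)
    {
        this.threshold = threshold;
        this.dwell = dwell;
    }

    // Feed one hand position per frame; returns true once per completed hover.
    public bool Update(int x, int y, DateTime now)
    {
        bool within = timing
            && Math.Abs(x - anchorX) <= threshold
            && Math.Abs(y - anchorY) <= threshold;

        if (!within)
        {
            // First sample, or the hand moved too far: restart from here.
            anchorX = x;
            anchorY = y;
            started = now;
            timing = true;
            return false;
        }

        if (now - started >= dwell)
        {
            timing = false;   // require a fresh hover before the next click
            return true;
        }
        return false;
    }
}
```

Passing the clock in makes the logic easy to exercise with made-up timestamps; inside a frame handler the argument would simply be DateTime.Now.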
It's just a handler for my timer's Tick event. I opted for the Win API for simulating the
mouse click. I couldn't find anything in the framework that would offer this. (Yes, I could do
a Button.PerformClick, and I actually did when I first started, but I wanted to be able to click
the "OK" button on the resulting message box. I did not want to create a custom form and
do Button.PerformClick there also just for this purpose.) For those of you unfamiliar with the
mouse_event Win API function, it is imported thusly:
using System.Runtime.InteropServices;

[DllImport("user32.dll")]
private static extern void mouse_event(
    UInt32 dwFlags,      // motion and click options
    UInt32 dx,           // horizontal position or change
    UInt32 dy,           // vertical position or change
    UInt32 dwData,       // wheel movement
    IntPtr dwExtraInfo   // application-defined information
);
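As a rough sketch of how that import can be used to simulate a left click, consider the wrapper below. The class name MouseClicker and the method LeftClick are hypothetical; the two flag values are the MOUSEEVENTF_LEFTDOWN and MOUSEEVENTF_LEFTUP constants from the Win32 headers. This is not the Tick handler from my project, only an illustration of the call pattern.

```csharp
using System;
using System.Runtime.InteropServices;

public static class MouseClicker
{
    // Flag values as defined in the Win32 headers (winuser.h).
    public const uint MOUSEEVENTF_LEFTDOWN = 0x0002;
    public const uint MOUSEEVENTF_LEFTUP = 0x0004;

    [DllImport("user32.dll")]
    private static extern void mouse_event(
        uint dwFlags, uint dx, uint dy, uint dwData, IntPtr dwExtraInfo);

    // Presses and releases the left button at the cursor's current position.
    // Windows only: the call fails on platforms without user32.dll.
    public static void LeftClick()
    {
        mouse_event(MOUSEEVENTF_LEFTDOWN, 0, 0, 0, IntPtr.Zero);
        mouse_event(MOUSEEVENTF_LEFTUP, 0, 0, 0, IntPtr.Zero);
    }
}
```

Since the cursor is already being moved to follow the hand, a click at the current cursor position is all that is needed once the hover completes.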
Click-tastic!
Speech Recognition
As I mentioned earlier, I wasn't content with just dancing around my form (although it was
thrilling at the time). I decided to experiment with speech recognition. This really isn't a
Kinect feature; rather you use the Kinect's microphone to receive the audio and this data is
then forwarded to the Microsoft Speech API. If you attempt this part of the project, make
sure you grab the libraries listed in the requirements at the beginning of the article. Pay
special attention to the note regarding the x86 versions of the libraries--this is important.
KinectAudioSource
The Microsoft.Research.Kinect.Audio namespace is the container for all things Kinect audio.
The KinectAudioSource class more or less represents the subsystem which acquires audio
data from the Kinect. Declaring an instance of this class will give you an interface to
receiving audio data from the device.
Note: much of the Speech API is not documented very well. I will do my best to describe what I
interpret the following classes to do, based on the samples I've looked at and the API
documentation (or lack thereof). I will try to keep an eye on the documentation for the Speech
API, and if it becomes up-to-date, I will update this article accordingly.
RecognizerInfo
This class contains data about a speech recognizer installed on your system. The member of
this class you will care about is the ID property, which identifies a speech recognizer. At the
time of this project, I believe I read that only US English was supported by the Kinect SDK;
perhaps this will change with more interest in the project or an official SDK release. Also at
the time of this project, the only supported recognizer is identified by the id
"SR_MS_en-US_Kinect_10.0". You can get this recognizer from the third link under the MS Speech
requirements section above. (As a side note, make sure you have VS closed when you install
these libraries so they get registered with VS appropriately. I believe the file
Microsoft.Speech.dll doesn't get put into the GAC (Global Assembly Cache), and you have to
add this reference manually by browsing to it. On my system, this file was installed to
"C:\Program Files (x86)\Microsoft Speech Platform SDK\Assembly\Microsoft.Speech.dll".
Adjust your path accordingly.)
SpeechRecognitionEngine
This class represents the actual recognizer installed on your system and you initialize an
instance of it by passing in the ID of the recognizer as found in an instance of a
RecognizerInfo object. It exposes a few events which you can use to take action when a
piece of speech is identified ( SpeechRecognized ) or rejected ( SpeechRecognitionRejected ),
or when your current recognition engine takes a guess at what you said (
SpeechHypothesized ), as well as a few other events. My experimentation only dealt with
recognized text.
GrammarBuilder
I'm going to have to surmise that this is something like a StringBuilder, but for a grammar.
In looking at the samples, a GrammarBuilder takes in a list of "choices" and factors in what
culture those choices are described as. My guess is that it generates what the "choice"
would sound like in the specified culture. That is only a guess.
Choices
The "choices" the GrammarBuilder uses to generate a word's sound are basically a list of
words, each its own string added to the Choices object.
Grammar
The Grammar uses the GrammarBuilder to generate the sound representations, and
subsequently stores those representations internally. Once these representations are
created, the Grammar can be loaded into the instance of the SpeechRecognitionEngine.
My example is going to differ a bit from the SDK examples, as I used a Forms app (the same
one as the NUI demo above, in fact) for my experimentation. The only real difference is that
the SDK example used a Console application and did everything inside the Main method,
and thus had a local scope on everything. For my demo, I used some form-level variables to
track my SpeechRecognitionEngine instance, in addition to a couple of other instance
members. One very important thing to take note of if you do a Forms app: you must use the
MTAThread attribute in order to prevent a nasty exception from cropping up. The API itself
apparently has some threading going on, and you need this attribute to account for this.
I initialized a new instance of the KinectAudioSource class so that I can have interaction with
the audio device. The next few assignments are taken straight from the example in the
SDK. Setting the FeatureMode to true allows modification of some of the device's features.
One such feature is "Automatic Gain Control," which according to the comments in the
sample, needs to be off for speech recognition. The assignment of OptibeamArrayOnly
means that "Audio Echo Cancellation" is not being used (the other option is
OptibeamArrayAndAec, i.e. with "AEC"). The other group of options under the SystemMode
refer to "SingleChannel," so I assume that "OptiBeam" refers to stereo, but don't quote me
on that. Once I initialize the KinectAudioSource instance with these settings, initialization of
the recognizer can begin.
The last steps for this speech recognition demo are to tell the KinectAudioSource to start
"listening" to me. Thus we call the Start method. This method returns a Stream object, and
this object should be captured as a reference so that it can be passed to the recognizer. This
Stream needs to be properly closed when the application finishes, so this is another reason
for maintaining the reference. The next call to the recognizer's SetInputToAudioStream
method takes in the stream just created and also sets up some sampling information for
acquiring audio data. I copied this from the sample directly because I am not terribly
familiar with the aspects of audio capturing. You can experiment as you see fit. Once this
point is reached, all that remains is to tell the recognizer to start recognizing, which can be
done, conveniently, with a call to the recognizer's RecognizeAsync method. I used the
"async" version of the Recognize family of calls so that my constructor would return and
not block. You can use the synchronous (blocking) methods if you like, but make sure you
do it properly (i.e. don't block where blocking wouldn't make sense, like a constructor).
I maintained references to the following for easy cleanup once the form was told to close:
KinectAudioSource, SpeechRecognitionEngine, and the Stream returned by the call to
KinectAudioSource.Start.
And so ends my first dabble into the world that is Kinect. I have to say that I had a BLAST
doing this project, even given how trivial it is. My hope with this article is to pique the
interest of all from novice to expert. This project was developed in the course of one day.
The possibilities of this API are quite expansive, given the proper amount of time and
design. As I play with the device and the API more, I will try to post new articles regarding
usage, such as interacting with the video camera. I'm sure as interest in the project
continues to escalate, so too will the quality of the API. The tools are out there. Don't be
intimidated; give your imagination a workout. Who knows what Kinect-ions you might
create = )
Yes, the project is called "Minority Report." The reason is because I told everyone at my office
that I was playing with the Kinect API and that my goal was to give my next demo using the
Kinect and have a Minority Report-like interface!
Comments
EXPERT COMMENT
A great article particularly of interest to early adopters of the Kinect API - thanks :)
EXPERT COMMENT
kaufmed:
Well done with this Article!
You are definitely ahead of the pack on this one.
AUTHOR COMMENT
@younghv
EXPERT COMMENT
Thanks for this article kaufmed, very interesting. I've often wanted to write
some motion capture tools and Kinect is certainly one possibility. The video is
especially interesting, you can see that it has a low frame rate (bit of a problem
for controlling games) and some inaccuracies in the data (the feet keep twisting
around).
Unfortunately using this myself would require me to jump into the MS upgrade
cycle (Windows 7 and .NET), the time and cost just aren't worth it, so I'm looking
for an alternative system. Still, I could see Kinect being used for motion capture
by a small game company.
AUTHOR COMMENT
Hi satsumo,
Thanks for reading! I'll have to look at the docs again as I'm not sure of the frame
rate on the skeletal tracking. I have begun playing with the color camera, and from
the examples in the SDK, the framerate for color runs at about 30 fps. I would
assume the skeletal tracking is that or less (probably less). I have just begun with
this kind of programming, so I am by no means an expert at video capture or game
programming.
I understand the desire not to lean MS. I work as a C# programmer, so the jump was
small for me. While you wouldn't be allowed to port the Kinect API discussed above
due to licensing restrictions, I believe some of the open source projects discussed
porting their APIs to Linux. I played around with the OpenKinect API in .NET, but I
didn't get very far with it. I believe they are one of the groups who were planning a
Linux port.
EXPERT COMMENT
I would guess its skeletal tracking framerate to be around 15-20 fps based on that
video. An active depth scanning camera won't be as fast as a passive light receiving
camera and the algorithm for figuring out where the body parts are in the depth
scan must be very complex. It's impressive that it works at all.
I can see why MS have opened up the SDK, apart from selling extra hardware it
allows people with more imagination to do something with the technology. MS have
never been innovators. I will take a look at OpenKinect, thanks for the information.
By the way, I don't use Linux, I just don't upgrade Windows unless I have a good
reason. Adopting Windows 7 would slow my PCs to a crawl. I'd have to buy new
machines, that would cost and yet make no difference to what I do. I don't install
.NET for similar reasons, I can't see what benefit it offers, and it's an enormous
resource hog. I think this has a lot to do with my background in consoles, where
programs are vastly more efficient than with Windows.
AUTHOR COMMENT
Understandable = )
AUTHOR COMMENT
I got curious and decided to test your assumptions about fps. I used the logic from
one of the examples to track the fps. I believe this is a coarse measurement, but I
believe it is close enough to affirm your assumptions. I am attaching a screencast
with the new logic.
Keep in mind that my physical environment may not be ideal either. I performed
the demo and test in my house and the tests were done at night, so I am using
household lighting. I also have the Kinect set up about 10 feet from where I stood
during the demo. In my experience with some of the Kinect games, this fringes on
the minimum distance for satisfactory recognition by the game. There may be some
variation in the numbers and performance of the demo due to the nature of my
setup.
AUTHOR COMMENT
Here is the timing code I incorporated from the sample. It was placed inside the
SkeletonFrameReady handler.
totalFrames++;
DateTime cur = DateTime.Now; // timestamp for this frame

if (cur.Subtract(lastTime) > TimeSpan.FromSeconds(1))
{
    // More than a second has passed: report the frames seen since the last report.
    int frameDiff = totalFrames - lastFrames;
    lastFrames = totalFrames;
    lastTime = cur;
    frameRate.Text = frameDiff.ToString() + " fps";
}
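The same counting idea can be pulled out of the handler so it can be tried with made-up timestamps. The class below is a hypothetical sketch (FrameRateCounter and Tick are names invented here); it takes the clock as an argument and returns -1 when no new reading is due, where the snippet above updates a label instead.

```csharp
using System;

// Hypothetical wrapper around the same counting idea, with the clock passed in.
public class FrameRateCounter
{
    private int totalFrames;
    private int lastFrames;
    private DateTime lastTime;

    public FrameRateCounter(DateTime start)
    {
        lastTime = start;
    }

    // Call once per frame. Returns the frames counted since the last report
    // when more than one second has elapsed, otherwise -1.
    public int Tick(DateTime now)
    {
        totalFrames++;
        if (now.Subtract(lastTime) > TimeSpan.FromSeconds(1))
        {
            int frameDiff = totalFrames - lastFrames;
            lastFrames = totalFrames;
            lastTime = now;
            return frameDiff;
        }
        return -1;
    }
}
```

The reading is coarse because the window is "a bit more than one second" rather than exactly one second, which is why the numbers should be treated as approximate.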
EXPERT COMMENT
Here's an interesting one ... have you tried running this and then bringing a chair
into the focal range of the Kinect camera? It maps the chair as a second skeleton!
Also lots of fun to be had if you can find one of those fitness balls (or a large sized
beach ball!) - let the Kinect map it and then try bouncing the ball!
EXPERT COMMENT
when you mentioned the color recognition it made me think of an easter egg where
if you wear a certain t-shirt you will get extra power ups.
AUTHOR COMMENT
Hi r1tman2003,
© 1996-2013 Experts Exchange, LLC. All rights reserved.