Last December there was a special on the Discovery Channel called "Earth 2050: Future Energy". I found it to be one of the more interesting programs that I've seen in quite a while. I have it DVR'd, but I was hoping to find a version for permanent record and to share with others, as I think it is worth showing. Tonight I got the brainstorm to take a look on YouTube (duh!) and found much of the content posted there, by none other than Shell. I had thought it was sponsored by Shell when I watched it, so that assumption turned out to be correct.
I'm going to post links to all of the YouTube videos here, but two segments in particular caught my interest:
- Driven by Design, and
- Extended Interview with Cliff Fox
I find these interesting because he discusses the application of point clouds to energy scenarios, specifically automotive navigation and fuel efficiency. Honestly, he doesn't get into any details, but it piques my interest given the work I've been doing with Kinect point cloud data, and since I build energy systems for a living...
The following are several stills from the first video showing the use of point cloud data and the mapping of image data onto the point clouds.
This mapping of image data onto point cloud data is a concept that I've been playing around with using the Kinect and C#. I don't have a screenshot to show at this point, but it is easily accomplished using Processing and several plug-in libraries and open source projects. If you want to play with this, I recommend getting the book "Making Things See"
(which is a great read):
The author, Greg Borenstein, shows how to do the image mapping onto point clouds with code written in Processing, along with the RGBDemo open source project courtesy of Nicolas Burrus. I have this running on my Mac and it works perfectly. I highly recommend it for experimentation. As an example, Nicolas has a video on YouTube (and his site) which shows 3D reconstruction of interior spaces, much like what is shown in the Future Energy / Shell videos:
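Conceptually, the mapping itself is simple: once the color image is registered to the depth image (RGBDemo and the Kinect/OpenNI stacks can do that alignment for you), you just pair each depth sample with its color pixel and emit a colored 3D point. Here's a minimal C# sketch of that idea; the types and the camera intrinsics are illustrative placeholders, not from any particular SDK:

```csharp
using System.Collections.Generic;

// A colored 3D point: position in meters plus an RGB color.
public record ColoredPoint(float X, float Y, float Z, byte R, byte G, byte B);

public static class PointCloudMapper
{
    // Sketch: back-project each depth pixel into 3D with a pinhole camera model,
    // then sample the registered color image at the same pixel location.
    // fx, fy, cx, cy are the depth camera intrinsics; depthMm is a width*height
    // array of millimeter readings, rgb a matching width*height*3 byte array.
    public static List<ColoredPoint> Map(
        ushort[] depthMm, byte[] rgb, int width, int height,
        float fx, float fy, float cx, float cy)
    {
        var cloud = new List<ColoredPoint>();
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                int i = y * width + x;
                ushort d = depthMm[i];
                if (d == 0) continue;            // 0 means "no reading" on the Kinect

                float z = d / 1000f;             // millimeters -> meters
                float px = (x - cx) * z / fx;    // pinhole back-projection
                float py = (y - cy) * z / fy;

                int c = i * 3;                   // assumes color is registered to depth
                cloud.Add(new ColoredPoint(px, py, z, rgb[c], rgb[c + 1], rgb[c + 2]));
            }
        }
        return cloud;
    }
}
```

Once the two images are registered, that loop is really all the "mapping" there is; the interesting work is in calibration and in what you do with the resulting cloud.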
Now, the Nokia videos use a much larger-scale depth imaging system, with obviously a lot more storage. But I have seen this done using a Kinect...
Martin Szarski, on his blog, Decorator Pattern, has an entry where he describes his adventure in strapping a Kinect to his car and going for a drive, recording the point cloud data, car location data, and images. An example of his result is just spectacular, and it can be done for the price of a Kinect (and some gas):
So this leaves me to wonder: what's to stop a number of people with inexpensive Kinects from crowdsourcing 3D point clouds and images, storing them in a cloud database, and providing analytics?
I think I'll have to work on that one... Stay tuned...
While doing research on hand tracking with the Kinect I came across a number of excellent resources. The following is an initial video that I found on YouTube, from MIT, which shows two-handed, Minority Report-style navigation with the Kinect:
I like how the palm and finger tips are represented, and I've modeled that in my library, NuiDotNet. This video also demonstrates the use of gestures and the grabbing of items; I have all the pieces for this put together in NuiDotNet, but I don't quite have a demo put together yet. That should be coming in the next few weeks.
This video led me into a lot of research on how to do this: finding a bunch of research papers and looking at a lot of examples on the Internet, many of which I found on YouTube. The one of most interest to me was CandescentNUI:
Nicely, the source for this is available at http://candescentnui.codeplex.com/. It provided an excellent framework for understanding the things needed to perform hand tracking. I've ended up making a bunch of modifications, including adding K-curvature algorithms, but I really have to thank Stefan for posting this. You can also go to his blog.
Since my last post several months back (about the PrimeSense device) I've been doing a lot of work with the Kinect. I've started work on a library that I'm referring to as NuiDotNet, which I plan to put on CodePlex, and which assists you in building applications that integrate NUI, via the Kinect, PrimeSense, the WiiMote, and voice recognition. As part of trying to get the word out about it, I've decided it's time to start posting about some of the work that I've done with it. I'm also speaking this weekend at Houston Tech Fest 2011 and want to get content out for anyone who attends my session; if you are there, please do come by, as I'd love your input.
The first video, which I'll go over right now, shows the Kinect doing hand tracking. Now, I've kind of jumped way ahead, well past things like skeleton tracking, to get to this point. Hand tracking is not something provided by the Kinect or its SDK (or OpenNI), so it involves a lot of custom programming that I will go over in this and subsequent posts. In this video I have placed my right hand about 1.5m in front of the sensor and am running a simple piece of code written with NuiDotNet. I open and close the hand, as well as extend and retract several of the fingers, to show the process in action.
To do this, there are a number of tasks that must be undertaken, on a frame-by-frame basis:
- Get the depth image from the Kinect
- Extract an appropriate view volume from the depth image
- Perform point cluster analysis on the view volume
- Generate an outline from the identified cluster
- Determine center of mass of the cluster
- Perform a modified K-Curvature algorithm to determine the locations of the fingers (a simplified sketch of the idea follows this list)
- From the finger information and the K-Curvature results, perform a least-squares fit over the finger outline to determine the direction each finger is pointing
- Render the visual in WPF
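To make the K-Curvature step a little more concrete: for each point on the hand contour, take the points k steps behind and k steps ahead and measure the angle they form at that point; sharp corners (small angles) are finger tip candidates. Here's a simplified sketch of the idea; this is the textbook form rather than the modified version in NuiDotNet, and the k and threshold values are just illustrative:

```csharp
using System;
using System.Collections.Generic;
using System.Windows;   // Point and Vector from WPF

public static class KCurvature
{
    // Simplified K-Curvature: walk the hand contour and flag points where the
    // angle formed with the points k steps behind and k steps ahead is small.
    // A real implementation also checks convexity to reject the valleys between
    // fingers, and clusters neighboring candidates into one point per tip.
    public static List<Point> FindFingerTipCandidates(
        IList<Point> contour, int k = 25, double maxAngleDegrees = 55)
    {
        var candidates = new List<Point>();
        int n = contour.Count;
        if (n <= 2 * k) return candidates;

        for (int i = 0; i < n; i++)
        {
            Point p = contour[i];
            Point behind = contour[(i - k + n) % n];   // contour is a closed loop
            Point ahead = contour[(i + k) % n];

            double angle = Vector.AngleBetween(behind - p, ahead - p);
            if (Math.Abs(angle) < maxAngleDegrees)
                candidates.Add(p);
        }
        return candidates;
    }
}
```

In practice you also test whether each sharp corner is convex (a tip) or concave (the valley between two fingers), and merge runs of adjacent candidates into a single point per finger.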
So what does the code look like to do this? I've tried to make it very simple on the surface with NuiDotNet. The XAML is the following:
Very easy. One canvas, 'canvas', which is for the next demo, and one canvas, 'handCanvas', where NuiDotNet will render the hand. The following is the entirety of the code for the rest of the window, which I hope shows the simplicity that I'm striving for in NuiDotNet:
Everything in NuiDotNet involves using some type of [Device]NuiDataSourceFactory, which will create various forms of NUI-based data streams for your application. Line 41 creates one for the Kinect. There is actually a layer of abstraction available that will even hide the specific devices from the application, allowing Kinects and PrimeSense (OpenNI) devices to be specified via configuration, but that's for another post.
Most things in NuiDotNet then revolve around getting a DepthDataSource object, which represents the depth sensor data stream from the device. This is created in line 42. Once a depth sensor stream is available, it can be passed into a number of other strategy objects or data sources. In this case, it is passed into a factory method that creates a HandDataSource (line 46), which does all the work to track a hand given a depth stream.
The hand data source takes several other parameters: a clustering parameter and a hand parameter. The clustering parameter defines how to look at the depth data to find a hand. This particular subclass will find the point nearest the sensor and construct a view volume from that point to 500mm further back in the depth stream. The 75 parameter effectively specifies the "floor", below which the data is ignored; this is useful if you are sitting at a desk like I am. The hand parameters specify various options for the K-Curvature algorithm, which I'll not get into at this point.
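Under the covers, building that view volume is essentially a depth threshold. Here's a rough sketch of the idea, not the actual NuiDotNet code; in particular, I've treated the floor value as a number of rows ignored at the bottom of the frame, which is just one way to read that parameter:

```csharp
using System.Collections.Generic;

public static class ViewVolume
{
    // Find the depth reading nearest the sensor, then keep only the pixels
    // within volumeDepthMm behind it, ignoring everything below the "floor".
    public static List<(int X, int Y, ushort DepthMm)> Extract(
        ushort[] depthMm, int width, int height,
        int volumeDepthMm = 500, int floorRows = 75)
    {
        // Nearest non-zero depth (zero means "no reading" on the Kinect).
        ushort nearest = ushort.MaxValue;
        foreach (ushort d in depthMm)
            if (d > 0 && d < nearest) nearest = d;

        var points = new List<(int, int, ushort)>();
        int lastRow = height - floorRows;   // rows below this are treated as the floor

        for (int y = 0; y < lastRow; y++)
            for (int x = 0; x < width; x++)
            {
                ushort d = depthMm[y * width + x];
                if (d >= nearest && d <= nearest + volumeDepthMm)
                    points.Add((x, y, d));
            }

        return points;
    }
}
```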
NuiDotNet then provides a number of visualizers that can be used to render various data streams. In this scenario, in lines 48 - 54, a hand visualizer is created, specifying what canvas to render to, which hand data source to render, and which visual elements of the hand are to be drawn; in this case, I want to see the center of the palm, finger tips (the blue dots), finger tip rays (blue vectors/rays extending from the finger), and the contour of the hand.
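Putting the pieces together, the setup described above boils down to something like the following. This is only a sketch: the type and member names here follow my description rather than the exact NuiDotNet surface, so treat them as illustrative.

```csharp
using System.Windows;

public partial class MainWindow : Window
{
    public MainWindow()
    {
        InitializeComponent();

        // A data source factory for the Kinect (illustrative name; NuiDotNet
        // exposes these as [Device]NuiDataSourceFactory types).
        var factory = new KinectNuiDataSourceFactory();

        // The depth data source wraps the raw depth stream from the sensor.
        var depthSource = factory.CreateDepthDataSource();

        // Clustering parameters: a view volume from the nearest point to 500mm
        // further back, ignoring everything below the "floor" value of 75.
        var clusterParams = new NearestPointClusteringParameter(depth: 500, floor: 75);
        var handParams = new HandParameter();   // K-Curvature options, defaults here

        // The hand data source does the actual tracking over the depth stream.
        var handSource = factory.CreateHandDataSource(depthSource, clusterParams, handParams);

        // Render the palm center, finger tips, finger-tip rays, and hand contour
        // onto the 'handCanvas' element declared in the XAML.
        var handVisualizer = new HandVisualizer(handCanvas, handSource,
            HandVisuals.PalmCenter | HandVisuals.FingerTips |
            HandVisuals.FingerTipRays | HandVisuals.Contour);

        handSource.Start();
    }
}
```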
It's that simple :)
To close this post, the following video shows a small extension of the application that maps the hand movement to the entire window and uses the pointing vector and hand location to determine whether the ray from a finger tip would intersect any of the rectangles in the window, effectively showing how this can be used to select items in the application without actually grabbing them (grabbing itself is supported by NuiDotNet and will be covered in a subsequent post):
This is a fairly simple extension using NuiDotNet constructs. I'm not going to get into the details of ray intersections with rectangles, but the way this is done is by hooking into the hand data source's NewHandDataAvailable event. The event is passed a hand object with properties such as PalmCenter and FingerTips, and each FingerTip also carries a vector representing the direction in which it is pointing. From within the event handler it is then possible to determine, with ray/line intersection algorithms, which item the finger is pointing at.
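For anyone curious, the event handling side of that looks roughly like the following. Again, this is a sketch: HandEventArgs, the Hand/FingerTip property names, targetRectangles, and HighlightTarget are illustrative stand-ins, while the rectangle hit test itself is a standard 2D ray/axis-aligned-box ("slab") intersection:

```csharp
using System;
using System.Collections.Generic;
using System.Windows;

public partial class MainWindow
{
    private readonly List<Rect> targetRectangles = new List<Rect>();

    // Handler hooked to the hand data source's NewHandDataAvailable event
    // (names as described above; the exact NuiDotNet signature may differ).
    private void OnNewHandData(object sender, HandEventArgs e)
    {
        foreach (var tip in e.Hand.FingerTips)
            foreach (var target in targetRectangles)
                if (RayIntersectsRect(tip.Position, tip.Direction, target))
                    HighlightTarget(target);   // application-specific feedback

    }

    // Standard slab test: does a ray starting at origin, heading along
    // direction, hit the axis-aligned rectangle?
    private static bool RayIntersectsRect(Point origin, Vector direction, Rect rect)
    {
        double tMin = 0, tMax = double.PositiveInfinity;

        // X slab
        if (Math.Abs(direction.X) < 1e-9)
        {
            if (origin.X < rect.Left || origin.X > rect.Right) return false;
        }
        else
        {
            double t1 = (rect.Left - origin.X) / direction.X;
            double t2 = (rect.Right - origin.X) / direction.X;
            tMin = Math.Max(tMin, Math.Min(t1, t2));
            tMax = Math.Min(tMax, Math.Max(t1, t2));
        }

        // Y slab
        if (Math.Abs(direction.Y) < 1e-9)
        {
            if (origin.Y < rect.Top || origin.Y > rect.Bottom) return false;
        }
        else
        {
            double t1 = (rect.Top - origin.Y) / direction.Y;
            double t2 = (rect.Bottom - origin.Y) / direction.Y;
            tMin = Math.Max(tMin, Math.Min(t1, t2));
            tMax = Math.Min(tMax, Math.Max(t1, t2));
        }

        return tMin <= tMax;   // the ray enters the rectangle before it exits
    }
}
```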
I'd like to share a video I made at MIX, as well as one I found on YouTube of part of the second-day keynote.
The first one I took myself; it is of the MSR team using the Kinect to drive the virtual telescope data. I love the way this shows flying through the solar system with just your hands:
The next is of the Kinect Drivable Lounge Chair:
I asked how much this cost to make and he said something like $30,000!