In Detecting the Kinect Devices, you could see that the two sensors on the Kinect® for Windows® are represented by two device IDs, one for the color sensor and one for the depth sensor. In that example, Device 1 is the color sensor and Device 2 is the depth sensor. This example shows how to create a videoinput object for the color sensor to acquire RGB images and then for the depth sensor to acquire skeletal data.
Create the videoinput object for the color sensor. DeviceID 1 is used for the color sensor.
vid = videoinput('kinect',1,'RGB_640x480');
Look at the device-specific properties on the source device, which is the color sensor on the Kinect camera.
src = getselectedsource(vid);
src

Display Summary for Video Source Object:

   General Settings:
      Parent = [1x1 videoinput]
      Selected = on
      SourceName = ColorSource
      Tag =
      Type = videosource

   Device Specific Properties:
      Accelerometer = [0.0 -1.0 0.0]
      AutoExposure = on
      AutoWhiteBalance = on
      BacklightCompensation = AverageBrightness
      Brightness = 0.2156
      CameraElevationAngle = 3
      Contrast = 1
      ExposureTime = 1.0
      FrameInterval = 0
      FrameRate = 30
      Gain = 0
      Gamma = 2.2
      Hue = 0
      PowerLineFrequency = Disabled
      Saturation = 1
      Sharpness = 0.5
      WhiteBalance = 2700
As you can see in the output, the color sensor has a set of device-specific properties.
Device-Specific Property – Color Sensor | Description |
---|---|
Accelerometer | Returns a 3D vector of acceleration data for both the color and depth sensors. The data is updated while the device is running or previewing. This 1 x 3 double represents the values of |
AutoExposure | Use to set the exposure automatically. This controls whether other related properties are activated. Values are on (default) and off. |
AutoWhiteBalance | Use to enable or disable the automatic white balance setting. |
BacklightCompensation | Configures backlight compensation modes to adjust the camera to capture images dependent on environmental conditions. Note that this property is only valid if Values are: |
Brightness | Indicates the brightness level. The value range is 0.0 to 1.0, and the default value is 0.2156. Note that this property is only valid if |
CameraElevationAngle | Controls the angle of the sensor lens. This is the camera angle relative to the ground. The value must be an integer in the range -27 to 27 degrees. The default value is the last set value, since this is a sticky setting. Only set it if you want to change the angle of the camera. This property is shared with the depth sensor. |
Contrast | Indicates contrast level. Values must be in the range 0.5 to 2, with a default value of 1. |
ExposureTime | Indicates the exposure time in increments of 1/10,000 of a second. The value range is 0 to 4000, and the default is 0. Note that this property is only valid if |
FrameInterval | Indicates the frame interval in units of 1/10,000 of a second. The value range is 0 to 4000, and the default is 0. Note that this property is only valid if |
FrameRate | Frames per second for the acquisition. This property is read only and the possible values for the color sensor are 12, 15, and 30 (default). It reflects the actual frame rate when running. |
Gain | Indicates a multiplier for the RGB color values. The value range is 1.0 to 16.0, and the default is 1.0. Note that this property is only valid if |
Gamma | Indicates gamma measurement. Values must be in the range 1 to 2.8, with a default value of 2.2. |
Hue | Indicates hue setting. Values must be in the range -22 to 22, with a default value of 0. |
PowerLineFrequency | Option for reducing flicker caused by the frequency of a power line. Values are Disabled, FiftyHertz, and SixtyHertz. The default is Disabled. Note that this property is only valid if |
Saturation | Indicates saturation level. Values must be in the range 0 to 2, with a default value of 1. |
Sharpness | Indicates sharpness level. Values must be in the range 0 to 1, with a default value of 0.5. |
WhiteBalance | Indicates color temperature in degrees Kelvin. The value range is 2700 to 6500 and the default is 2700. Note that this property is only valid if |
You can optionally set some of the properties shown in the previous step. For example, you might be acquiring images in a low light situation. You could adjust the acquisition for this by setting the BacklightCompensation property to LowLightsPriority, which favors a low light level.
src.BacklightCompensation = 'LowLightsPriority';
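For numeric properties, it can help to keep a requested value inside the documented range before assigning it. The following is a minimal sketch, not part of the adaptor: the `ranges` struct and the variables `name`, `requested`, and `value` are hypothetical, and the limits are copied from the table above. It runs without a Kinect; the final assignment to `src` is left as a comment.

```matlab
% Documented ranges copied from the color sensor property table.
% The struct and variable names here are hypothetical helpers.
ranges = struct('Brightness', [0 1], ...
                'Contrast',   [0.5 2], ...
                'Gamma',      [1 2.8], ...
                'Hue',        [-22 22], ...
                'Saturation', [0 2], ...
                'Sharpness',  [0 1]);

name      = 'Contrast';   % property to change
requested = 2.5;          % value outside the documented range

% Clamp the requested value to [low, high] for the chosen property.
lim   = ranges.(name);
value = min(max(requested, lim(1)), lim(2));

% With a connected device, the clamped value could then be assigned:
%   src.(name) = value;
```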
Preview the color stream by calling preview on the color sensor object created in step 1.
preview(vid);
When you are done previewing, close the preview window.
closepreview(vid);
Create the videoinput object for the depth sensor. Note that a second object is created (vid2), and DeviceID 2 is used for the depth sensor.
vid2 = videoinput('kinect',2,'Depth_640x480');
Look at the device-specific properties on the source device, which is the depth sensor on the Kinect.
src = getselectedsource(vid2);
src

Display Summary for Video Source Object:

   General Settings:
      Parent = [1x1 videoinput]
      Selected = on
      SourceName = DepthSource
      Tag =
      Type = videosource

   Device Specific Properties:
      Accelerometer = [0.0 -1.0 0.0]
      BodyPosture = Standing
      CameraElevationAngle = 4
      DepthMode = Default
      FrameRate = 30
      IREmitter = on
      SkeletonsToTrack = [1x0 double]
      TrackingMode = off
As you can see in the output, the depth sensor has a set of device-specific properties associated with skeletal tracking. These properties are specific to the depth sensor.
Device-Specific Property – Depth Sensor | Description |
---|---|
Accelerometer | Returns a 3D vector of acceleration data for both the color and depth sensors. The data is updated while the device is running or previewing. This 1 x 3 double represents the values of |
BodyPosture | Indicates whether the tracked skeletons are standing or sitting. Values are Standing (gives 20 point skeleton data) and Seated (gives 10 point skeleton data, using joint indices 2 - 11). Standing is the default. Note that if See the subsection “BodyPosture Joint Indices” at the end of this example for the list of indices of the 20 skeletal joints. |
CameraElevationAngle | Controls the angle of the sensor lens. This is the camera angle relative to the ground. The value must be an integer in the range -27 to 27 degrees. The default value is the last set value, since this is a sticky setting. Only set it if you want to change the angle of the camera. This property is shared with the color sensor. |
DepthMode | Indicates the range of depth in the depth map. Values are Default (range of 50 to 400 cm) and Near (range of 40 to 300 cm). |
FrameRate | Frames per second for the acquisition. This property is read only and is fixed at 30 for the depth sensor for all formats. It reflects the actual frame rate when running. |
IREmitter | Controls whether the IR emitter is on or off. Values are on and off. Initially, the default value is on. However, this is a sticky property, so the default value is the last set value. If you set it to off, it remains off in future uses until you change the setting. This property is useful for avoiding interference when you use multiple Kinect devices. |
SkeletonsToTrack | Indicates the Skeleton Tracking ID returned as part of the metadata. Values are: |
TrackingMode | Indicates tracking state. Values are: Note that if |
Start the second videoinput object (the depth stream).
start(vid2);
Skeletal data is accessed as metadata on the depth stream. You can use getdata to access it.
% Get the data on the object.
[frame, ts, metaData] = getdata(vid2);

% Look at the metadata to see the parameters in the skeletal data.
metaData

metaData =

10x1 struct array with fields:
    AbsTime: [1x1 double]
    FrameNumber: [1x1 double]
    IsPositionTracked: [1x6 logical]
    IsSkeletonTracked: [1x6 logical]
    JointDepthIndices: [20x2x6 double]
    JointImageIndices: [20x2x6 double]
    JointTrackingState: [20x6 double]
    JointWorldCoordinates: [20x3x6 double]
    PositionDepthIndices: [2x6 double]
    PositionImageIndices: [2x6 double]
    PositionWorldCoordinates: [3x6 double]
    RelativeFrame: [1x1 double]
    SegmentationData: [640x480 double]
    SkeletonTrackingID: [1x6 double]
    TriggerIndex: [1x1 double]
These metadata fields are related to tracking the skeletons.
MetaData | Description |
---|---|
AbsTime | This is a 1 x 1 double and represents the full timestamp, including date and time, in MATLAB clock format. |
FrameNumber | This is a 1 x 1 double and represents the frame number. |
IsPositionTracked | This is a 1 x 6 Boolean matrix of true/false values for the tracking of the position of each of the six skeletons. A 1 indicates the position is tracked and a 0 indicates it is not. |
IsSkeletonTracked | This is a 1 x 6 Boolean matrix of true/false values for the tracked state of each of the six skeletons. A 1 indicates it is tracked and a 0 indicates it is not. |
JointDepthIndices | If the BodyPosture property is set to Standing, this is a 20 x 2 x 6 double matrix of x- and y-coordinates for 20 joints in pixels relative to the depth image, for the six possible skeletons. If BodyPosture is set to Seated, this would be a 10 x 2 x 6 double for 10 joints. |
JointImageIndices | If the BodyPosture property is set to Standing, this is a 20 x 2 x 6 double matrix of x- and y-coordinates for 20 joints in pixels relative to the color image, for the six possible skeletons. If BodyPosture is set to Seated, this would be a 10 x 2 x 6 double for 10 joints. |
JointTrackingState | This 20 x 6 integer matrix contains enumerated values for the tracking accuracy of each joint for all six skeletons. Values include: |
JointWorldCoordinates | This is a 20 x 3 x 6 double matrix of x-, y- and z-coordinates for 20 joints, in meters from the sensor, for the six possible skeletons, if the BodyPosture is set to Standing. If it is set to Seated, this would be a 10 x 3 x 6 double for 10 joints. See step 9 for the syntax on how to see this data. |
PositionDepthIndices | A 2 x 6 double matrix of X and Y coordinates of each skeleton in pixels relative to the depth image. |
PositionImageIndices | A 2 x 6 double matrix of X and Y coordinates of each skeleton in pixels relative to the color image. |
PositionWorldCoordinates | A 3 x 6 double matrix of the X, Y and Z coordinates of each skeleton in meters relative to the sensor. |
RelativeFrame | This 1 x 1 double represents the frame number relative to the execution of a trigger if triggering is used. |
SegmentationData | Image size double array with each pixel mapped to a tracked/detected skeleton, represented by numbers 1 to 6. This segmentation map is a bitmap with pixel values corresponding to the index of the person in the field-of-view who is closest to the camera at that pixel position. A value of 0 means there is no tracked skeleton. |
SkeletonTrackingID | This 1 x 6 integer matrix contains the tracking IDs of all six skeletons. These IDs track specific skeletons using the SkeletonsToTrack property in step 5. Tracking IDs are generated by the Kinect and change from acquisition to acquisition. |
TriggerIndex | This is a 1 x 1 double and represents the trigger the event is associated with if triggering is used. |
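Because getdata returns one metadata element per acquired frame (a 10x1 struct array in the output above), it is often convenient to index a single frame or to stack one field across frames. The sketch below illustrates this with a synthetic struct array so that it runs without a Kinect. The three-frame array and its two fields are assumptions standing in for real metadata.

```matlab
% Synthetic stand-in for the metadata returned by getdata (assumption:
% three frames, and only the two fields used below).
numFrames = 3;
metaData = repmat(struct('FrameNumber', 0, ...
                         'IsSkeletonTracked', false(1, 6)), numFrames, 1);
for k = 1:numFrames
    metaData(k).FrameNumber = k;
end
metaData(2).IsSkeletonTracked(1) = true;
metaData(3).IsSkeletonTracked(1) = true;

% Index one frame at a time to get a scalar struct.
lastFrame = metaData(end);

% Stack a field across all frames into a numFrames-by-6 logical matrix.
trackedPerFrame = vertcat(metaData.IsSkeletonTracked);

% Flag the frames in which at least one skeleton was tracked.
frameHasSkeleton = any(trackedPerFrame, 2);
```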
You can look at any individual property by drilling into the metadata. For example, look at the IsSkeletonTracked property.
metaData.IsSkeletonTracked

ans =

     1     0     0     0     0     0
In this case it means that of the six possible skeletons, there is one skeleton being tracked and it is in the first position. If you have multiple skeletons, this property is useful to confirm which ones are being tracked.
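One way to turn this logical vector into the positions of the tracked skeletons is to use find. The sketch below types in the value shown above by hand so that it runs without a device. With live data, the vector would come from one frame of the metadata, for example `metaData(1).IsSkeletonTracked`. The variable names are illustrative only.

```matlab
% Value from the output above, entered by hand (no Kinect required).
isSkeletonTracked = logical([1 0 0 0 0 0]);

% Positions (1 to 6) of the skeletons that are currently tracked.
trackedSlots = find(isSkeletonTracked);
numTracked   = numel(trackedSlots);

% Visit only the tracked skeletons.
for slot = trackedSlots
    fprintf('Skeleton in position %d is tracked.\n', slot);
end
```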
Get the joint locations for the first person in world coordinates using the JointWorldCoordinates property. Since this is the person in position 1, the index uses 1.
metaData.JointWorldCoordinates(:,:,1)

ans =

   -0.1408   -0.3257    2.1674
   -0.1408   -0.2257    2.1674
   -0.1368   -0.0098    2.2594
   -0.1324    0.1963    2.3447
   -0.3024   -0.0058    2.2574
   -0.3622   -0.3361    2.1641
   -0.3843   -0.6279    1.9877
   -0.4043   -0.6779    1.9877
    0.0301   -0.0125    2.2603
    0.2364    0.2775    2.2117
    0.3775    0.5872    2.2022
    0.4075    0.6372    2.2022
   -0.2532   -0.4392    2.0742
   -0.1869   -0.8425    1.8432
   -0.1869   -1.2941    1.8432
   -0.1969   -1.3541    1.8432
   -0.0360   -0.4436    2.0771
    0.0382   -0.8350    1.8286
    0.1096   -1.2114    1.5896
    0.1196   -1.2514    1.5896
The columns represent the X, Y, and Z coordinates in meters of the 20 points on skeleton 1.
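As an illustration of working with these coordinates, the following sketch computes two simple distances from the sample values above. The rows for the hip center (joint index 1) and head (joint index 4) are copied by hand so that the code runs without a device. The variable names are hypothetical, and treating the vector norm as the distance from the sensor assumes the sensor is at the origin of the world coordinates.

```matlab
% Rows copied from the sample output above (skeleton 1, in meters).
hipCenter = [-0.1408 -0.3257 2.1674];   % joint index 1
head      = [-0.1324  0.1963 2.3447];   % joint index 4

% With live data these rows could be read from one frame instead:
%   joints = metaData(1).JointWorldCoordinates(:,:,1);
%   head   = joints(4,:);

% Straight-line distance from the coordinate origin to the head.
headDistance = norm(head);

% Straight-line distance between the head and the hip center.
torsoLength = norm(head - hipCenter);
```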
You can optionally view the segmentation data as an image.
% View the segmentation data as an image.
% metaData has one element per frame, so index a single frame.
imagesc(metaData(1).SegmentationData);

% Set the color map to jet to color code the people detected.
colormap(jet);
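Beyond displaying it, the segmentation map can be used to count pixels per person or to build a mask for one person. The sketch below uses a tiny synthetic map as an assumption in place of real SegmentationData so that it runs without a Kinect. The map size and region placement are made up for illustration.

```matlab
% Tiny synthetic segmentation map (assumption: stands in for one frame
% of SegmentationData). 0 = no skeleton, 1 to 6 = skeleton index.
seg = zeros(6, 8);
seg(2:5, 2:3) = 1;   % 4-by-2 region assigned to skeleton 1
seg(3:4, 6:7) = 2;   % 2-by-2 region assigned to skeleton 2

% Count the pixels assigned to each of the six possible skeletons.
pixelCounts = zeros(1, 6);
for id = 1:6
    pixelCounts(id) = nnz(seg == id);
end

% Logical mask for skeleton 1, usable to index an image of the same size.
mask1 = (seg == 1);
```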
BodyPosture Joint Indices
The BodyPosture property, in step 5, indicates whether the tracked skeletons are standing or sitting. Values are Standing (gives 20 point skeleton data) and Seated (gives 10 point skeleton data, using joint indices 2 - 11).
This is the order of the joints returned by the Kinect adaptor:
Hip_Center = 1;
Spine = 2;
Shoulder_Center = 3;
Head = 4;
Shoulder_Left = 5;
Elbow_Left = 6;
Wrist_Left = 7;
Hand_Left = 8;
Shoulder_Right = 9;
Elbow_Right = 10;
Wrist_Right = 11;
Hand_Right = 12;
Hip_Left = 13;
Knee_Left = 14;
Ankle_Left = 15;
Foot_Left = 16;
Hip_Right = 17;
Knee_Right = 18;
Ankle_Right = 19;
Foot_Right = 20;
When BodyPosture is set to Standing, all 20 indices are returned, as shown above. When BodyPosture is set to Seated, numbers 2 through 11 are returned, since this represents the upper body of the skeleton.
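A name lookup can make code that indexes joints easier to read. The following sketch stores the joint names in the order listed above and selects the subset that the text gives for Seated mode (indices 2 through 11). The cell array and variable names are illustrative and are not part of the adaptor.

```matlab
% Joint names in the order returned by the Kinect adaptor (from the list above).
jointNames = {'Hip_Center', 'Spine', 'Shoulder_Center', 'Head', ...
              'Shoulder_Left', 'Elbow_Left', 'Wrist_Left', 'Hand_Left', ...
              'Shoulder_Right', 'Elbow_Right', 'Wrist_Right', 'Hand_Right', ...
              'Hip_Left', 'Knee_Left', 'Ankle_Left', 'Foot_Left', ...
              'Hip_Right', 'Knee_Right', 'Ankle_Right', 'Foot_Right'};

% Indices reported in Seated mode, as stated in the text.
seatedIdx   = 2:11;
seatedNames = jointNames(seatedIdx);

% Look up the index of a joint by name.
headIdx = find(strcmp(jointNames, 'Head'));
```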
Note
To understand the differences in using the Kinect adaptor compared to previous toolbox adaptors, see Important Information About the Kinect Adaptor. For information about Kinect device discovery and the use of two device IDs, see Detecting the Kinect Devices. For an example of simultaneous acquisition, see Acquiring from Color and Depth Devices Simultaneously.