File Formats

We provide the RGB-D datasets from the Kinect in the following format:

Color images and depth maps

We provide the time-stamped color and depth images as a gzipped tar file (TGZ).

  • The color images are stored as 640×480 8-bit RGB images in PNG format.
  • The depth maps are stored as 640×480 16-bit monochrome images in PNG format.
  • The color and depth images are already pre-registered using the OpenNI driver from PrimeSense, i.e., the pixels in the color and depth images correspond already 1:1.
  • The depth images are scaled by a factor of 5000, i.e., a pixel value of 5000 in the depth image corresponds to a distance of 1 meter from the camera, 10000 to 2 meters, etc. A pixel value of 0 means missing value (no data).
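The depth encoding above can be applied directly when loading the PNG files. The following sketch (assuming NumPy and Pillow are available; the function name and file path are illustrative) converts a raw 16-bit depth PNG to metric depth and maps the missing-data value 0 to NaN:

```python
import numpy as np
from PIL import Image

def depth_png_to_meters(path, factor=5000.0):
    """Load a 16-bit depth PNG and convert raw values to meters.

    A raw value of 5000 corresponds to 1 meter; a raw value of 0
    marks missing data and is mapped to NaN.
    """
    raw = np.asarray(Image.open(path), dtype=np.float64)
    depth_m = raw / factor
    depth_m[raw == 0] = np.nan
    return depth_m
```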

Ground-truth trajectories

We provide the ground-truth trajectory as a text file containing the translation and orientation of the camera in a fixed coordinate frame. Note that our automatic evaluation tool also expects both the ground-truth and the estimated trajectory to be in this format.

  • Each line in the text file contains a single pose.
  • The format of each line is 'timestamp tx ty tz qx qy qz qw'
  • timestamp (float) gives the number of seconds since the Unix epoch.
  • tx ty tz (3 floats) give the position of the optical center of the color camera with respect to the world origin as defined by the motion capture system.
  • qx qy qz qw (4 floats) give the orientation of the optical center of the color camera in form of a unit quaternion with respect to the world origin as defined by the motion capture system.
  • The file may contain comments, which must start with '#'.
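A trajectory file in the format above can be parsed with a few lines of Python. The sketch below (function names are our own) reads the poses and also shows how to turn the unit quaternion into a rotation matrix, which is what most evaluation or visualization code ultimately needs:

```python
import numpy as np

def read_trajectory(path):
    """Parse a 'timestamp tx ty tz qx qy qz qw' trajectory file.

    Lines starting with '#' are comments and are skipped.
    Returns a list of (timestamp, translation, quaternion) tuples.
    """
    poses = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#'):
                continue
            t, tx, ty, tz, qx, qy, qz, qw = (float(x) for x in line.split())
            poses.append((t, np.array([tx, ty, tz]),
                          np.array([qx, qy, qz, qw])))
    return poses

def quat_to_rotmat(q):
    """Convert a unit quaternion (qx, qy, qz, qw) to a 3x3 rotation matrix."""
    x, y, z, w = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - z*w),     2*(x*z + y*w)],
        [2*(x*y + z*w),     1 - 2*(x*x + z*z), 2*(y*z - x*w)],
        [2*(x*z - y*w),     2*(y*z + x*w),     1 - 2*(x*x + y*y)],
    ])
```

A camera-frame point p is then mapped to the world frame as quat_to_rotmat(q) @ p + t.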

Intrinsic Camera Calibration of the Kinect

The Kinect has a factory calibration stored onboard, based on a high level polynomial warping function. The OpenNI driver uses this calibration for undistorting the images, and for registering the depth images (taken by the IR camera) to the RGB images. Therefore, the depth images in our datasets are reprojected into the frame of the color camera, which means that there is a 1:1 correspondence between pixels in the depth map and the color image.

The conversion from the 2D images to 3D point clouds works as follows. Note that the focal lengths (fx/fy), the optical center (cx/cy), the distortion parameters (d0-d4) and the depth correction factor are different for each camera. The Python code below illustrates how the 3D point can be computed from the pixel coordinates and the depth value:

fx = 525.0  # focal length x
fy = 525.0  # focal length y
cx = 319.5  # optical center x
cy = 239.5  # optical center y

factor = 5000.0  # for the 16-bit PNG files
# OR: factor = 1.0  # for the 32-bit float images in the ROS bag files

# depth_image is a NumPy array of shape (height, width),
# e.g. loaded from one of the 16-bit depth PNGs
height, width = depth_image.shape
for v in range(height):
    for u in range(width):
        Z = depth_image[v, u] / factor
        X = (u - cx) * Z / fx
        Y = (v - cy) * Z / fy
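In practice, the per-pixel loop is slow in Python. The same back-projection can be done in one shot with NumPy (the function name and the handling of missing pixels are our own choices, not part of the original script):

```python
import numpy as np

def depth_to_pointcloud(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5,
                        factor=5000.0):
    """Back-project a 16-bit depth image into an N x 3 point cloud
    using the pinhole model from the per-pixel loop above.

    Pixels with raw depth 0 (missing data) are dropped.
    """
    v, u = np.indices(depth.shape)
    z = depth.astype(np.float64) / factor
    valid = depth > 0
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x[valid], y[valid], z[valid]], axis=-1)
```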

Note that the above script uses the default (uncalibrated) intrinsic parameters. The intrinsic parameters for the Kinects used in the fr1 and fr2 dataset are as follows:

Calibration of the color camera

We computed the intrinsic parameters of the RGB camera from the rgbd_dataset_freiburg1/2_rgb_calibration.bag.

Camera          fx      fy      cx      cy      d0       d1       d2       d3       d4
(ROS default)   525.0   525.0   319.5   239.5   0.0      0.0      0.0      0.0      0.0
Freiburg 1 RGB  517.3   516.5   318.6   255.3   0.2624   -0.9531  -0.0054  0.0026   1.1633
Freiburg 2 RGB  520.9   521.0   325.1   249.7   0.2312   -0.7849  -0.0033  -0.0001  0.9172
Freiburg 3 RGB  535.4   539.2   320.1   247.6   0.0      0.0      0.0      0.0      0.0

Note that both the color and IR images of the Freiburg 3 sequences have already been undistorted, therefore the distortion parameters are all zero. The original distortion values can be found in the tgz file.

Note: We recommend using the ROS default parameter set (i.e., without undistortion), as undistorting the pre-registered depth images is not trivial.

Calibration of the depth images

We verified the depth values by comparing the reported depth values to the depth estimated from the RGB checkerboard. In this experiment, we found that the reported depth values from the Kinect were off by a constant scaling factor, as given in the following table:

Camera             ds
Freiburg 1 Depth   1.035
Freiburg 2 Depth   1.031
Freiburg 3 Depth   1.000

Note: We already pre-scaled the depth images of all sequences accordingly, so that no action on your side is necessary.

Calibration of the infrared camera

We also provide the intrinsic parameters for the infrared camera. Note that the depth images provided in our dataset are already pre-registered to the RGB images. Therefore, rectifying the depth images based on these intrinsic parameters is not straightforward.

Camera         fx      fy      cx      cy      d0       d1       d2       d3       d4
Freiburg 1 IR  591.1   590.1   331.0   234.0   -0.0410  0.3286   0.0087   0.0051   -0.5643
Freiburg 2 IR  580.8   581.8   308.8   253.0   -0.2297  1.4766   0.0005   -0.0075  -3.4194
Freiburg 3 IR  567.6   570.2   324.7   250.1   0.0      0.0      0.0      0.0      0.0

Note that both the color and IR images of the Freiburg 3 sequences have already been undistorted, therefore the distortion parameters are all zero. The original distortion values can be found in the tgz file.

Movies for visual inspection

For visual inspection of the individual datasets, we also provide movies of the Kinect (RGB and depth) and of an external camcorder. The movie format is mpeg4 stored in an AVI container.

Alternate file formats

ROS bag

For people using ROS, we also provide ROS bag files that contain the color images, monochrome images, depth images, camera infos, point clouds, and transforms, including the ground-truth transformation from the /world frame, all in a single file. The bag files (ROS Diamondback) contain the following message topics:

  • /camera/depth/camera_info (sensor_msgs/CameraInfo) contains the intrinsic camera parameters for the depth/infrared camera, as reported by the OpenNI driver
  • /camera/depth/image (sensor_msgs/Image) contains the depth map
  • /camera/rgb/camera_info (sensor_msgs/CameraInfo) contains the intrinsic camera parameters for the RGB camera, as reported by the OpenNI driver
  • /camera/rgb/image_color (sensor_msgs/Image) contains the color image from the RGB camera
  • /imu (sensor_msgs/Imu), contains the accelerometer data from the Kinect
  • /tf (tf/tfMessage), contains:
    • the ground-truth data from the mocap (/world to /Kinect)
    • the calibration between the mocap and the optical center of the Kinect's color camera (/Kinect to /openni_camera),
    • and the ROS-specific, internal transformations (/openni_camera to /openni_rgb_frame to /openni_rgb_optical_frame).

If you need the point clouds and monochrome images, you can use the adding_point_clouds_to_ros_bag_files script to add them:

  • /camera/rgb/image_mono (sensor_msgs/Image) contains the monochrome image from the RGB camera
  • /camera/rgb/points (sensor_msgs/PointCloud2) contains the colored point clouds
  • /camera/depth/points (sensor_msgs/PointCloud2) contains the point cloud

Mobile Robot Programming Toolkit (MRPT)

Jose Luis Blanco has added our dataset to the Mobile Robot Programming Toolkit (MRPT) repository. The dataset (including example code and tools) can be downloaded from the MRPT repository.

Last edited 26.01.2013 12:07 by sturmju