Each collected dataset is stored in its own folder, named with a random hash, for example 71de12f9. A dataset folder has the following directory structure:

camera_matrix.csv
odometry.csv
depth/
  - 000000.npy
  - 000001.npy
  - ...
rgb.mp4

rgb.mp4 is an HEVC-encoded video containing the data recorded by the iPhone's camera.

The depth/ directory contains the depth maps, one .npy file per RGB frame. Each file is a NumPy array of uint16 values with a height of 192 elements and a width of 256 elements. Each value is the measured depth in millimeters at that pixel position. The files can be loaded with the np.load function.
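A depth frame can be loaded and converted to meters roughly like this (a minimal sketch; the example path is illustrative):

```python
import numpy as np

def load_depth(path):
    """Load one depth frame and convert uint16 millimeters to float32 meters."""
    depth_mm = np.load(path)  # shape (192, 256), dtype uint16
    return depth_mm.astype(np.float32) / 1000.0
```

For example, `load_depth("71de12f9/depth/000000.npy")` returns a (192, 256) float32 array of depths in meters.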

The camera_matrix.csv file contains the 3 x 3 matrix of camera intrinsic parameters:

$\begin{bmatrix}f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$
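Combined with a depth value, the intrinsics let you back-project a pixel (u, v) into 3D camera coordinates. A sketch, assuming the CSV is comma-separated (adjust the delimiter if not), and noting that if the intrinsics correspond to the full video resolution they would first need to be scaled to the 256 x 192 depth resolution:

```python
import numpy as np

def load_camera_matrix(path):
    # Assumes a comma-separated 3x3 matrix; adjust the delimiter if needed.
    return np.loadtxt(path, delimiter=",")

def unproject(u, v, depth_m, K):
    """Back-project pixel (u, v) with depth in meters into camera coordinates."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return np.array([x, y, depth_m])
```

A pixel at the principal point (cx, cy) maps to (0, 0, depth), as expected for a pinhole model.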

The odometry.csv file contains the camera pose for each frame. The first line is a header; each following line, one per frame, contains the estimated camera pose relative to the global coordinate frame. Each line has 7 values: the first 3 encode the x, y, z position of the camera in meters, and the last 4 represent the quaternion rotation of the camera frame. The quaternion values are ordered qx, qy, qz, w, where qx, qy, qz are the imaginary components corresponding to each axis and w is the real component. The coordinate system is right-handed, with the z axis pointing forward in the camera's viewing direction, the x axis pointing toward the bottom of the phone where the Lightning connector is, and the y axis pointing toward the left edge of the phone when held in portrait mode.
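The per-frame poses can be turned into 4 x 4 transformation matrices. A sketch, assuming comma-separated values and the 7-value, qx-qy-qz-w layout described above:

```python
import numpy as np

def quat_to_rotmat(qx, qy, qz, w):
    """Convert a unit quaternion (qx, qy, qz, w) to a 3x3 rotation matrix."""
    return np.array([
        [1 - 2 * (qy**2 + qz**2), 2 * (qx * qy - qz * w), 2 * (qx * qz + qy * w)],
        [2 * (qx * qy + qz * w), 1 - 2 * (qx**2 + qz**2), 2 * (qy * qz - qx * w)],
        [2 * (qx * qz - qy * w), 2 * (qy * qz + qx * w), 1 - 2 * (qx**2 + qy**2)],
    ])

def load_poses(path):
    """Read odometry.csv and return one 4x4 pose matrix per frame."""
    rows = np.loadtxt(path, delimiter=",", skiprows=1)  # skip the header line
    poses = []
    for x, y, z, qx, qy, qz, w in np.atleast_2d(rows):
        T = np.eye(4)
        T[:3, :3] = quat_to_rotmat(qx, qy, qz, w)
        T[:3, 3] = [x, y, z]
        poses.append(T)
    return poses
```

Each resulting matrix maps points from the camera frame into the global coordinate frame; the identity quaternion (0, 0, 0, 1) yields the identity rotation.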

Here is a link to a script that visualizes the collected data.