1 Introduction

Our motivation for a data collection framework is to enable a community-based effort to collect driving data in India. The main challenges in this ecosystem are high cost and the steep learning curve of the required technical know-how. A low-cost off-the-shelf solution used as-is falls short of meeting reliability, quality, performance, and real-time requirements. Several systems have been proposed for real-time data collection (Table 1), but, as Table 4 shows, their cost is prohibitive in a developing economy. To address this challenge, we have created a recipe for a reference hardware design and an associated software framework that scales in performance while minimizing the initial capital investment. The stack is also designed to achieve the maximum throughput possible on a commercial automotive-grade system under real-time constraints.

1.1 Related and Prior Work

Table 1 lists related data collection frameworks. Most of them use monocular cameras [4, 9, 10], and the ones that use stereo support a maximum of two stereo instances [5]. Our goal is to capture surround stereo camera data to enable stereo-based algorithm development, which needs a minimum of four stereo cameras. Published frameworks rely on high-end servers for computation: Ford Campus Vision [9] uses four 2U servers with quad-core processors, while LISA-A [10] uses two servers, each with four Xeon processors providing 32 threads. Our work, DDCF, stands apart in its use of low-cost compute without compromising on state-of-the-art sampling rates and data resolution. To our knowledge, there is no existing robust data collection system whose compute costs less than 1000 USD; this is the prime motivation for this work.

Table 1. Comparison of related data collection frameworks (from LISA-A [10])

The rest of the paper is organized as follows: Sect. 2 discusses the challenges faced in designing a system for our target community; Sect. 3 covers the system design and architecture; Sect. 4 describes the system configuration and sensor suite; and Sect. 5 discusses the shortcomings and the scope for improvement.

2 Challenges

Based on our experience, the community was apprehensive about investing large capital upfront and was more inclined towards incremental upgrades to its data collection rig. Field engineers using the data collection vehicle faced challenges in configuring, running, and maintaining the system: they expect minimal pre-flight checks, consistency, repeatability, and reliability. The system also has to support high-bandwidth sensors (e.g., cameras) as well as synchronization among them. The basic requirements of such a system are:

  1. Scalability in terms of performance, number of sensors, and cost.

  2. Support for high-resolution (1080p) cameras at 30 fps.

  3. Uncompressed sensor data (e.g., YUYV, RGB, point clouds).

  4. Synchronization of multimodal sensors: GPS, IMU, LIDAR, and camera.

Fig. 1. Sensor layout: an electric car with 4 stereo cameras, 1 LIDAR and 1 GPS/IMU

Apart from supporting multiple sensors, the selection of a framework or middleware for such a system is a major challenge. Among the many options, ROS is the most suitable choice, as opposed to a complete ground-up implementation. However, deploying ROS on a low-cost platform exposed performance issues: for example, using rosbag record to write image data to disk involves encoding the image buffer, multiple in-memory copies, and serialization, which degrades performance in terms of frames per second.

3 Driving Data Collection Framework

In this section, we describe the proposed framework: its architecture and design, and the optimization approaches we took.

3.1 Architecture and Design

Scalability. Scalability in terms of performance and cost was the most important design criterion for the system. We leverage ROS's distributed architecture to connect multiple low-cost hosts across a network hub. If a host H1 has maxed out its I/O bandwidth and compute with a set of sensors S1, more sensors can be added by connecting a new host H2 carrying an additional sensor set S2 over an Ethernet hub; further hosts Hn with sensors Sn can be added as needed (Table 2). This enables low-cost incremental scalability. Several changes and enhancements were made to the standard ROS framework to meet our requirements; these are described below.

Table 2. Scalable configurations possible with DDCF
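
As an illustration, the sketch below shows how a second host joins the same ROS graph over the hub using the standard ROS_MASTER_URI/ROS_IP mechanism. This is a minimal sketch only: the addresses and the node name are assumptions, not values from our deployment.

```python
import os

# Assumed addresses on the Ethernet hub; adjust for your network.
# A single roscore runs on host H1 (192.168.1.10 here).
os.environ["ROS_MASTER_URI"] = "http://192.168.1.10:11311"
os.environ["ROS_IP"] = "192.168.1.11"  # address of this host, H2

import rospy  # imported after the environment is set

# Capture nodes started on H2 register with the master on H1, so
# sensors attached to H2 join the same ROS graph as those on H1.
rospy.init_node("h2_capture_manager")
rospy.loginfo("H2 joined graph at %s", os.environ["ROS_MASTER_URI"])
rospy.spin()
```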

Messaging Architecture. The first problem we faced was data throughput. ROS provides a distributed message-passing framework that allows independent processes/threads to capture data. However, this approach has limitations for data-heavy sensors such as cameras: the standard ROS messaging path involves multiple copies plus serialization and de-serialization, which introduces latency. To overcome this, we modified the message path by separating the data from the metadata. Only the metadata is published as ROS messages, while every image frame is written directly to disk as a binary file; the published metadata of every frame is recorded into ROS bags. This approach is depicted in Figs. 2 and 3.

Fig. 2. Stage 1: capture - data is written to SSD as raw binary files; metadata is published as ROS messages.

Fig. 3. Stage 2: consolidation - raw binary files are combined into a coherent stream using the metadata and published as a single ROS bag.
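
The sketch below illustrates the stage-1 capture path in Python. The actual message definitions and tooling live in the project repository [6]; the directory layout, topic name, and the use of a JSON-carrying std_msgs/String as a stand-in metadata message are illustrative assumptions only.

```python
import json
import os

import rospy
from std_msgs.msg import String

DATA_DIR = "/mnt/ssd0/cam0"  # assumed per-sensor directory on the SSD
if not os.path.isdir(DATA_DIR):
    os.makedirs(DATA_DIR)

rospy.init_node("cam0_capture")
meta_pub = rospy.Publisher("/cam0/meta", String, queue_size=100)

def on_frame(frame_bytes, seq):
    """Write the raw frame straight to disk; publish only metadata."""
    stamp = rospy.Time.now()
    path = os.path.join(DATA_DIR, "%010d.bin" % seq)
    with open(path, "wb") as f:
        f.write(frame_bytes)  # raw YUYV buffer, no encoding step
    meta = {"seq": seq, "stamp": stamp.to_sec(), "path": path,
            "width": 1920, "height": 1080, "encoding": "yuyv"}
    meta_pub.publish(String(data=json.dumps(meta)))  # tiny message only
```

Because rosbag record now subscribes only to the small metadata topic, the bag stays light while frames reach the disk at the full sensor rate.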

Two-Stage Pipeline. A data collection system demands uninterrupted data capture without information loss. A standard single-stage pipeline consisting of capture and record did not meet the real-time performance we needed, so the pipeline is broken into two stages. The first stage is capture: all raw sensor data is written to disk as binary files, with the corresponding metadata recorded as ROS bags.

The second stage is consolidation, which runs offline to build a coherent data stream: it combines the raw data from the binary files with the metadata from the ROS bags and outputs a single ROS bag as the final artifact.
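
A minimal sketch of such a consolidation pass, using the standard rosbag Python API and assuming the stage-1 metadata layout sketched above (the real tool ships with the framework [6]; the encoding mapping is an assumption to verify against your pixel format):

```python
import json

import rosbag
import rospy
from sensor_msgs.msg import Image

# Read stage-1 metadata, pull the raw frames back from disk, and
# emit one coherent bag containing full Image messages.
with rosbag.Bag("meta.bag") as meta_bag, \
     rosbag.Bag("consolidated.bag", "w") as out_bag:
    for topic, msg, t in meta_bag.read_messages(topics=["/cam0/meta"]):
        meta = json.loads(msg.data)
        img = Image()
        img.header.seq = meta["seq"]
        img.header.stamp = rospy.Time.from_sec(meta["stamp"])
        img.width, img.height = meta["width"], meta["height"]
        img.encoding = "yuv422"   # assumed ROS name for the packed YUV data
        img.step = img.width * 2  # YUYV packs 2 bytes per pixel
        with open(meta["path"], "rb") as f:
            img.data = f.read()
        out_bag.write("/cam0/image_raw", img, t)
```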

Filesystem. The filesystem plays a major role in data throughput, especially in real-time systems. In our scenario, we wanted a standard filesystem that could meet our throughput requirements, so we experimented with the different filesystems available on Linux. With emphasis on ease of use, we consciously avoided SSD-specific filesystems even though they offer higher throughput. As shown in Table 3, Btrfs has the highest write throughput for an application such as this.

Table 3. Comparison of file systems performance
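
A comparison of this kind can be reproduced with a simple sequential-write benchmark, sketched below. The mount point, block size, and total size are assumptions; flushing each block with fdatasync keeps the page cache from inflating the numbers.

```python
import os
import time

MOUNT = "/mnt/fs_under_test"  # mount point of the filesystem under test
BLOCK = b"\0" * (4 << 20)     # 4 MiB blocks, written sequentially
TOTAL = 8 << 30               # 8 GiB total, large enough for a stable figure

path = os.path.join(MOUNT, "throughput.bin")
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
start = time.time()
written = 0
while written < TOTAL:
    written += os.write(fd, BLOCK)
    os.fdatasync(fd)  # force each block to the device before continuing
os.close(fd)
print("%.1f MB/s" % (written / (time.time() - start) / 1e6))
```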

Latency Optimization. The data capture pipeline was optimized using a zero in-memory-copy approach for persisting raw data, similar to the approach adopted in other ROS-based implementations for autonomous driving, e.g., Baidu Apollo. This gave us near real-time performance without resorting to a strict real-time OS or a bare-metal embedded system.
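
In Python terms, the principle looks like the sketch below: the driver's buffer is wrapped in a memoryview and handed to the kernel directly, so no intermediate user-space copy is made on the capture path. This is a sketch of the principle, not the exact implementation in [6].

```python
import os

def persist_frame(fd, frame_buffer):
    """Persist a sensor buffer through an already-open file descriptor
    without copying it in user space first."""
    view = memoryview(frame_buffer)  # zero-copy view of the driver buffer
    while view:
        n = os.write(fd, view)  # kernel reads straight from the buffer
        view = view[n:]         # slicing a memoryview makes no copy
```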

3.2 Sensor Calibration

Intrinsics and Stereo Calibration. Camera intrinsics are calibrated using Zhang's checkerboard-pattern approach [11], and stereo calibration is performed using OpenCV tools.
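
Both steps use standard OpenCV calls; a condensed sketch follows, in which the checkerboard dimensions and the image file layout are assumptions.

```python
import glob

import cv2
import numpy as np

PATTERN = (9, 6)  # inner-corner count of the checkerboard (assumed)
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2)

obj_pts, left_pts, right_pts = [], [], []
pairs = zip(sorted(glob.glob("calib/left_*.png")),
            sorted(glob.glob("calib/right_*.png")))  # assumed layout
for lp, rp in pairs:
    left = cv2.imread(lp, cv2.IMREAD_GRAYSCALE)
    right = cv2.imread(rp, cv2.IMREAD_GRAYSCALE)
    okL, cL = cv2.findChessboardCorners(left, PATTERN)
    okR, cR = cv2.findChessboardCorners(right, PATTERN)
    if okL and okR:
        obj_pts.append(objp)
        left_pts.append(cL)
        right_pts.append(cR)
size = left.shape[::-1]  # image size as (width, height)

# Per-camera intrinsics (Zhang's method), then stereo extrinsics R, T.
_, K1, d1, _, _ = cv2.calibrateCamera(obj_pts, left_pts, size, None, None)
_, K2, d2, _, _ = cv2.calibrateCamera(obj_pts, right_pts, size, None, None)
_, K1, d1, K2, d2, R, T, E, F = cv2.stereoCalibrate(
    obj_pts, left_pts, right_pts, K1, d1, K2, d2, size,
    flags=cv2.CALIB_FIX_INTRINSIC)
```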

Extrinsics for Non-overlapping Fields of View. For extrinsic calibration between cameras with non-overlapping fields of view, we use a modified version of Pagel's method [8] with AprilTags [7]. Using an AprilTag array instead of a checkerboard pattern improved repeatability: since each tag in the array can be uniquely identified, calibration of the fixed targets needs to be done only once.
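
A building block of any such procedure is recovering the pose of a single tag in a camera's frame, sketched below assuming the Python apriltag package and previously calibrated intrinsics K and dist. The tag size and the corner ordering are assumptions to verify against the detector's convention.

```python
import apriltag
import cv2
import numpy as np

TAG_SIZE = 0.16  # tag edge length in metres (assumed)
detector = apriltag.Detector(apriltag.DetectorOptions(families="tag36h11"))

def tag_pose(gray, K, dist):
    """Pose (rvec, tvec) of the first detected tag in the camera frame."""
    det = detector.detect(gray)[0]
    s = TAG_SIZE / 2.0
    # Tag corners in the tag's own frame; order must match det.corners.
    obj = np.array([[-s, -s, 0], [s, -s, 0],
                    [s, s, 0], [-s, s, 0]], np.float32)
    _, rvec, tvec = cv2.solvePnP(obj, det.corners.astype(np.float32), K, dist)
    return det.tag_id, rvec, tvec
```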

Extrinsics of Camera and LIDAR. For extrinsic calibration between a camera and a LIDAR, we used intermediate results of Dhall et al. [3]. Calibrating a 16-line LIDAR against a stereo camera was challenging because of the sparse point cloud; multiple iterations were performed to reduce the error.
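
One simple way to judge the error between iterations is to reproject the point cloud into the image with the current extrinsics and inspect the overlay, as in this sketch (an illustrative check, not the procedure of [3] itself):

```python
import cv2
import numpy as np

def overlay_lidar(points_xyz, rvec, tvec, K, dist, image):
    """Draw LIDAR points on the camera image using the current extrinsics
    so calibration quality can be inspected between iterations."""
    pts, _ = cv2.projectPoints(points_xyz.astype(np.float32),
                               rvec, tvec, K, dist)
    for u, v in pts.reshape(-1, 2).astype(int):
        if 0 <= u < image.shape[1] and 0 <= v < image.shape[0]:
            cv2.circle(image, (int(u), int(v)), 1, (0, 255, 0), -1)
    return image
```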

4 System Configuration

The compute hardware is a low-cost setup such as an Intel NUC. Since the application demands high disk-write throughput, we recommend SSDs with a write speed of 520 MB/s for storage. RAID is desirable but not mandatory, as it increases cost. If a compute host has multiple disks, disk I/O is balanced experimentally by assigning specific disks to sensor nodes; optimal I/O loading is capped at 80% of a disk's write bandwidth.
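
For instance, one 1080p YUYV stream at 30 fps needs 1920 × 1080 × 2 bytes × 30 ≈ 124 MB/s, so a 520 MB/s SSD capped at 80% (≈ 416 MB/s) carries at most three such streams. The short sketch below works this budgeting out; the numbers match our recommended disks, but the helper itself is illustrative.

```python
DISK_WRITE_MBPS = 520  # rated sequential write speed of the SSD
IO_CAP = 0.80          # cap loading at 80% of write capability

def stream_mbps(width, height, bytes_per_px, fps):
    """Raw (uncompressed) bandwidth of one video stream in MB/s."""
    return width * height * bytes_per_px * fps / 1e6

cam = stream_mbps(1920, 1080, 2, 30)  # YUYV packs 2 bytes per pixel
budget = DISK_WRITE_MBPS * IO_CAP     # ~416 MB/s of usable bandwidth
print("per stream: %.0f MB/s; streams per disk: %d" % (cam, budget // cam))
```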

Our hardware is composed of:

  • Intel Core i5 processor

  • 4 × USB 3.0 ports

  • Ethernet

  • 1 × 8 GB RAM

  • 2 × 1 TB SSD

This hardware specification costs about 900 USD (Fig. 4). The compute is powered by the electric car's battery; using low-powered compute extended the electric car's range by 50%. Table 4 shows a comparison with similar data collection frameworks and their estimated costs.

Fig. 4. An example of a low-cost data collection hardware kit built using the framework. From top left to right: suction mount, DC voltage regulator, low-cost compute, GPS, and stereo camera

Table 4. Cost comparison of related data collection frameworks

Figure 1 shows our sensor layout on an electric car. Our test vehicle (Fig. 5) carries the following sensors:

  • 4 × ZED stereo cameras

  • 1 × Velodyne VLP-16 LIDAR

  • 1 × Advanced Navigation Spatial GPS/IMU

Fig. 5. An electric car mounted with stereo cameras, LIDAR, GPS and IMU

The software stack is built from readily available components:

  • Ubuntu 16.04 LTS (in run level 3)

  • ROS Kinetic

  • Sensor nodes from the open-source community

  • New ROS messages for managing data and metadata on disk [6]

  • Tools for consolidating the data from different streams [6]

5 Conclusion and Future Work

Our framework is designed for capturing data in any driving scenario and is being continuously improved as an open-source project [6].

With USB cameras, synchronization between the cameras was not possible. We plan to add support for more sensors, such as PCIe-based cameras.

Processing of the raw captured data currently has to be done offline. A live second-stage consolidation node that performs lazy consolidation during capture would extend the capture time and optimize the use of available storage.

We are working towards our goal of creating an approachable recipe, in terms of time, effort, and cost, for anybody to build a data collection rig from low-cost hardware. We believe this will enable wider community participation in creating datasets for autonomous driving research.