Introduction

The motivation of this study was to enable the autonomy of firefighter robots at petrochemical complexes. Petrochemical complexes in fire disasters pose a high environmental risk because large fires and explosions can cause injuries, fatalities, and devastation. The use of firefighter robots can reduce the risk to firefighters. Such a system is comprised of several vehicles, such as a water-shooting robot, a hose-extending robot, and an exploration robot. An autonomous capability facilitates their control and enables many robots to be controlled by only a few operators.

One key technology for autonomous firefighter robots is simultaneous localization and mapping (SLAM), which is required because petrochemical complexes are restricted areas and there are limited opportunities to update their maps. In addition, the maps are dynamically changed in real time with developments such as fires, explosions, moving firefighters and mobile trucks. Therefore, Fast-SLAM is used in such an environment.

In this work, two dimensional Fast-SLAM is used for a firefighter robot in petrochemical complexes. The road of a petrochemical complex has no difference in the ground level because the road is built to accommodate the large firefighter trucks and large water shooting nozzles. In this environment, we can develop a two-dimensional Fast-SLAM without considering ground level differences. Long-range light detection and ranging (LIDAR) sensors are used with Fast-SLAM to develop a map of petrochemical complexes.

Petrochemical complexes have a sparse layout of very large oil tanks, where distances between oil tanks are 40–80 m. LIDAR is suitable because it can measure distances of 0–100 m with high accuracy (e.g., Velodyne HDL-32E ± 0.02 m). Consequently, a map with a sparse oil tank layout can be built by SLAM.

Our procedure using Fast-SLAM for the autonomy of a firefighter robot is explained here. The purpose of Fast-SLAM is to build a map for facilitating autonomy. First, Fast-SLAM is used to construct a map in a normal environment by human firefighters manually controlling the robot. Next, using the Fast-SLAM constructed map, human firefighters set a target point for the robot and the robot autonomously moves to the target by localizing its position using the map. Therefore, the autonomy of firefighter robots can be realized by using the Fast-SLAM map.

The main purpose of this research is map consistency because a map can be used not only by firefighter robots, but also by human firefighters who operate these robots. The use of GPS helps increase the consistency. In this study, we used GPS and LIDAR data to build the map. GPS and LIDAR provided the heterogeneous data; their combination therefore required consideration for the map consistency.

This paper describes a method that employs two Rao-Blackwellized particle filters (RBPFs). The method is based on GPS and LIDAR for map consistency in petrochemical complexes. Fast-SLAM 1.0 (FS 1.0) and Fast-SLAM 2.0 (FS 2.0) by Grisetti et al. both in a grid map, were used for the RBPFs [1]. Their weight functions were revised for the RBPFs based on the GPS and LIDAR sensor for map consistency in petrochemical complexes. GPS and LIDAR data have heterogeneous characteristics. These sensor data are complementary and can be used to improve the accuracy and consistency of the resulting map. Therefore, we propose the use of Fast-SLAM to combine GPS and LIDAR. The difference between the original Fast-SLAM and the proposed method is the use of the log-likelihood function of GPS; the proposed combination method is implemented using probabilistic mathematics formulation.

Figure 1 shows the result of FS 2.0 based on GPS and LIDAR using the proposed weight function. The upper figure shows an aerial image of the petrochemical complex; the lower figure is three-dimensional (3D) point cloud data reconstructed by our proposed FS 2.0 based on GPS and LIDAR.

Fig. 1
figure 1

Result of petrochemical complex (380–550 m) consistency: map for Fast-SLAM 2.0 in a grid map based on GPS and LIDAR (bottom) compared to an aerial map (top) [2]

The remainder of this paper is organized as follows. In “Related works” section, review on related works are described. In “Simultaneous localization and mapping to maintain consistency in large areas” section, the proposed method is formulated. In “Evaluation” section, we describe the experiments conducted for evaluating the map consistency estimated by our method. In “Discussion” section, the experimental results are presented. In “Conclusion” section, our conclusions are provided.

Related works

There are two types of SLAM. The first includes filter SLAM, such as RBPFs [3], extended Kalman filter (EKF) [4], sparse information filter [5], and topological/hierarchical filter [6, 7]. The other type includes batched SLAM, such as respective graph-based [8], square-root-based [9], sparse-pose-adjustment (SPA) [10], and incremental-smoothing-and-mapping (iSAM) methods [11]. This research used filter SLAM, which is an RBPFs because of limited processing power. The filter SLAM showed better results in the case of limited processing power [12].

RBPFs were first introduced by Doucet et al. [13]. Montemerlo et al. extended RBPFs into Fast-SLAM 1.0 by using EKF for landmark feature representation [14]. Later, Montemerlo et al. extended it into a faster RBPFs, specifically, Fast-SLAM 2.0 (FS 2.0) [14]. Their research determined that the use of scan-matching results for proposal distribution could increase Fast-SLAM 1.0 speed and accuracy. Furthermore, Grisetti et al. formulated Fast-SLAM 2.0 in a dense grid map environment [1]. Therefore, FS 1.0 and FS 2.0 were selected for the RBPFs SLAM. To increase the SLAM consistency, sensors with high consistency, such as GPS, could be used.

Several researchers used GPS to increase map consistency. For example, Gamma-SLAM uses GPS to improve the camera-based RBPF results [15]. In the first step, the map and trajectory are estimated by RBPFs. The map and trajectory are then aligned using GPS data to minimize the error between the GPS and RBPF trajectories. Singular value decomposition (SVD) and a batch algorithm are used. In addition, Schleicher proposed hierarchical SLAM [16]. For low-level sub-maps, a wide-angle-camera-based EKF SLAM was fused with GPS for consistency. Each sub-map was then combined by using batch-algorithm multi-level relaxation (MLR) based on GPS to increase the global consistency of the combined map. Our work fuses GPS data with LIDAR inside RBPFs by the proposed weight function for RBPFs, which fuses LIDAR and GPS. This method does not require an additional batch SLAM. GPS and LIDAR are heterogeneous sensors; therefore, the correct sensor fusion is required to achieve a complementary result.

The method for fusion of GPS with LIDAR was proposed by several researchers. Wei et al. used a normalized innovation squared (NIS) method to evaluate each camera/laser/GPS sensor validity before fusing the sensors with an unscented information filter for localization [17]. Soloviev used a Kalman filter with GPS, inertial measurements, and LIDAR [18]. Hentsche et al. used a Kalman filter to integrate GPS and the inertia measurement. For position estimation, the Kalman filter result was added to Monte Carlo localization by replacing 10% of the overall particles with the lowest weight [19]. Our present research aims to achieve sensor fusion of GPS and LIDAR with a complementary effect. To this end, the proposed method employs RBPFs with an importance weight function for the fusion.

Fusion of GPS with an RBPF importance function was previously proposed. Fusion of GPS and IMU by the Kalman filter for RBPF particle reweighting was used in [20, 21]. These RBPFs are similar to FS 1.0 in that sensors are fused by EKF. Ren et al. used the Kalman filter to separately estimate GPS and IMU [20]. Depending on the sensor availability, only the sensor will be updated to the particle filter. Fusion of sensors is successful because the particle filter follows a Bayesian rule. Both approaches employ fusion data using a method similar to FS 1.0 for a landmark-based environment. Our approach focuses on fusion of GPS and LIDAR for FS 1.0 and FS 2.0 for grid-map environments.

Simultaneous localization and mapping to maintain consistency in large areas

We herein propose RBPFs based on GPS and LIDAR to maintain map consistency. GPS and LIDAR sensor data are complementary. SLAM based on LIDAR uses scan matching to localize the robot positions. A characteristic of LIDAR scan matching is local accuracy in management and oil refinery building areas in which several buildings are located near a road. However, scan matching increases errors in areas with large oil tanks, where tanks and building layouts are sparse (more than 80 m between structures). On the other hand, a characteristic of GPS data is that they are globally absolute and accurate in areas including large-oil-tank areas, especially where there is high satellite availability. The errors in GPS measurements increase in management buildings and oil refinery areas because of satellite signal diffraction. For effective fusion, complementary sensors, which are GPS and LIDAR, are used for map consistency. We propose two extension methods of RBPF based on GPS and LIDAR using a dense grid map. FS 1.0 and FS 2.0, both renowned methods for RBPFs, were used, as formulated by Grisetti et al. [1]. To achieve a complementary effect on sensor fusion of RBPF based on GPS and LIDAR, probabilistic mathematics was used to combine GPS and LIDAR data in RBPFs importance weight \(w_{t}^{(i)}.\) For implementation, the RBPFs importance weight was formulated using the log-likelihood.

RBPFs based on GPS and LIDAR

Douce et al. showed that RBPFs are an effective solution for SLAM by factorizing the states of maps and paths [13]. RBPFs use importance resampling to prevent correct particles from being resampled away. To obtain sensor fusion, a probabilistic mathematical formulation is used; in particular, the formulation of importance weight \(w^{(i)}_{t},\) mentioned above, is proposed to handle the GPS and LIDAR data. Figure 2 shows the graphical model of our proposed RBPFs, which contains two observations \(z_{\mathrm {GPS}~t}\) and \(z_{\mathrm {LIDAR}~t}\) as parts of observation \(z_{t}.\)

Formulation of RBPFs

Fig. 2
figure 2

Graphical model of SLAM based on GPS and LIDAR

RBPFs estimate robot pose \(x_{1:t}\) and map m from the sensor data, including motion \(u_{1:t}\) and observation \(z_{1:t}.\) Equation (1) shows the factorization of RBPFs [22]:

$$\begin{aligned}&p(x_{1:t},m\mid z_{1:t},u_{1:t-1}) \nonumber \\&\qquad \qquad =\underbrace{p(x_{1:t} \mid m,z_{1:t},u_{1:t-1})}_\text {Target distribution}\cdot \underbrace{p(m \mid x_{1:t},z_{1:t})}_\text {Map}. \end{aligned}$$
(1)

The main idea of RBPF is that SLAM problems are divided into pose estimation, \(p(x_{1:t}|m,z_{1:t},u_{1:t-1}),\) and map estimation, \(p(m|x_{1:t},z_{1:t}).\) Here, the target distribution is \(p(x_{1:t}|m,z_{1:t},u_{1:t-1}).\) A particle filter is used to estimate the target distribution. The map is built based on particle positions. Therefore, the target distribution is modified for the sensor fusion of GPS and LIDAR data. The target distribution in Eq. (1) can be divided into an importance distribution and a proposal distribution by the Bayesian rule and Markov assumption, as shown in Eq. 2.

$$\begin{aligned}&\underbrace{p(x_{1:t} \mid m,z_{1:t},u_{1:t-1})}_\text {Target distribution} \nonumber \\&=\gamma \underbrace{p(z_{1:t} \mid m, x_{1:t},z_{1:t-1},u_{1:t-1})}_\text {Importance factor} \underbrace{p(x_{1:t} \mid m, z_{ 1:t-1},u_{1:t-1})}_\text {Proposal distribution}. \end{aligned}$$
(2)

Here, \(\gamma = 1/p(z_{t} \mid z_{1:t-1},u_{1:t-1})\) is a normalizer. Thrun showed that the importance factor in Eq. (2) is interpreted as a set of particles, \(w^{(i)}_{t},\) weighted by observation \(z_{1:t}.\) The normalizer is omitted in the weight of importance, \(w^{(i)}_{t},\) as the weight of importance is a marginal probability [23] given by:

$$\begin{aligned} w^{(i)}_{t} ~=~ \frac{\text {Target distribution}}{\text {Proposal distribution}}. \end{aligned}$$
(3)

where an importance weight is assigned to each particle (i). The importance weight can be calculated by using a recursive formula [24]. This involves recursive multiplication of the previous weight, \(w^{(i)}_{t-1},\) with the current observation, \(p(z_{t})\). It thus becomes:

$$\begin{aligned} w^{(i)}_{t} = \frac{p(z_{t} \mid m^{(i)}_{t-1}, x^{(i)}_{t})p(x^{(i)}_{t} \mid x^{(i)}_{t-1},u_{t-1})}{p(x_{t} \mid m^{(i)}_{t-1},x^{(i)}_{1:t-1}, z_{1:t},u_{1:t-1})} \cdot w^{(i)}_{t-1}. \end{aligned}$$
(4)

Equation (4) uses an observation sensor. In this work, the observation sensors were expanded to fuse GPS and LIDAR.

Formulation of RBPFs based on GPS and LIDAR

Our proposed fusion of GPS and LIDAR is a conditionally independent sensor fusion. GPS and LIDAR independence is shown in the Markov graphic in Fig. 2. No arrow indicates the dependence between GPS and LIDAR. In addition, GPS has no dependence on the map, as shown in Fig. 2. We propose the weight of importance equation using the Bayesian rule and Markov assumption is implemented for the two RBPFs of FS 1.0 in Eq. (7) and FS 2.0 based on GPS and LIDAR in Eq. (9). Given that observation z is

$$\begin{aligned} z= \{ {z_{\mathrm {gps}},z_{\mathrm {ldr}}} \} , \end{aligned}$$
(5)

then, using the Bayesian rule and Markov assumption, the weight of importance \(w_{t}^{(i)}\) equation for RBPFs based on GPS and LIDAR in Eq. (4) becomes

$$\begin{aligned} w^{(i)}_{t} = w^{(i)}_{t-1} \cdot \frac{p(z_{\mathrm {gps}~t}\mid x^{(i)}_{t})p(z_{\mathrm {ldr~t}}\mid m^{(i)}_{t-1}, x^{(i)}_{t})p(x^{(i)}_{t} \mid x^{(i)}_{t-1},u_{t-1})}{p(x_{t} \mid m^{(i)}_{t-1},x^{(i)}_{t-1}, z_{t},u_{1:t-1})}. \end{aligned}$$
(6)

We formulate our GPS and LIDAR fusion on both FS 1.0 and FS 2.0 because both FS 1.0 and FS 2.0 are renowned RBPFs.

FS 1.0 uses an odometry-based motion model, \(p(x_{t} \mid x_{t-1},u_{t-1}).\) This model is used as a proposal distribution (denominator) for Eq. (6). Hence, the weight function becomes

$$\begin{aligned} w^{(i)}_{t} ~=~ p(z_{\mathrm {gps}~t}| x^{(i)}_{t})p(z_{\mathrm {ldr}~t}|x^{(i)}_{t},m^{(i)}_{t-1}) w^{(i)}_{t-1}. \end{aligned}$$
(7)

The FS 2.0 proposal distribution uses odometry with a recently reported scan-matching-based motion model, \(p(x_{t} \mid m^{(i)}_{t-1},x^{(i)}_{t-1},z_{\mathrm {ldr}~t},u_{t-1})\) [1]. FS 2.0 provides significantly better accuracy when odometry with the scan-matching based motion model is used. The scan-matching-based motion model shows more accurate results than when using odometry alone. For this purpose, odometry with the scan-matching-based motion model is used as a proposal distribution (denominator) for Eq. (6), whereby the weight function becomes

$$\begin{aligned}&p(x_{t} \mid m^{(i)}_{t-1} ,x^{(i)}_{t-1},z_{\mathrm {ldr}~t},u_{t-1})\nonumber \\&~=~ \frac{p(z_{\mathrm {ldr}~t} \mid m^{(i)}_{t-1},x_{t})p(x_{t} \mid x^{(i)}_{t-1},u_{t-1})}{p(z_{\mathrm {ldr}~t} \mid m^{(i)}_{t-1},x^{(i)}_{t-1},u_{t-1})}. \end{aligned}$$
(8)

Thus, combining Eqs. (8 and 6) gives

$$\begin{aligned}&w^{(i)}_{t} ~=~ w^{(i)}_{t-1} \nonumber \\&\cdot \frac{p(z_{\mathrm {gps}~t} \mid x^{(i)}_{t})p(z_{\mathrm {ldr}~t} \mid m^{(i)}_{t-1}, x^{(i)}_{t})p(x^{(i)}_{t} \mid x^{(i)}_{t-1},u_{t-1})}{\frac{p(z_{\mathrm {ldr}~t} \mid m^{(i)}_{t-1},x_{t})p(x_{t} \mid x^{(i)}_{t-1},u_{t-1})}{p(z_{\mathrm {ldr}~t} \mid m^{(i)}_{t-1},x^{(i)}_{t-1},u_{t-1})}} \end{aligned}$$
(9)

and

$$\begin{aligned} w^{(i)}_{t} ~= & {} ~ w^{(i)}_{t-1} p(z_{\mathrm {ldr}~t} \mid m^{(i)}_{t-1},x^{(i)}_{t-1},u_{t-1}) \nonumber \\&\cdot p(z_{\mathrm {gps}~t} \mid x^{(i)}_{t}) . \end{aligned}$$
(10)

Based on the total probability, the term \(p(z_{\mathrm {ldr}~t }\mid m^{(i)}_{t-1},x^{(i)}_{t-1},u_{t-1})\) becomes \(\sum ^{K}_{j=1} p(z_{\mathrm {ldr}~t} \mid m^{(i)}_{t-1},x_{j}) \cdot p(x_{j} \mid x^{(i)}_{t-1},u_{t-1}),\) where j is the potential pose of the robot and K is the number of potential robot poses. Therefore, the new \(w^{(i)}_{t}\) becomes

$$\begin{aligned} w^{(i)}_{t} ~= & {} ~w^{(i)}_{t-1} \sum ^{K}_{j=1}p(z_{\mathrm {ldr}~t} \mid m^{(i)}_{t-1},x_{j}) \cdot p(x_{j} \mid x^{(i)}_{t-1},u_{t-1}) \nonumber \\&\cdot p(z_{\mathrm {gps}~t} \mid x^{(i)}_{t}) . \end{aligned}$$
(11)

Implementation of balance weight fusion

For implementation, the log-likelihood function is used to calculate the weights for both (a) FS 1.0 based on GPS and LIDAR and (b) FS 2.0 based on GPS and LIDAR. This is because the log-likelihood calculation is more efficient owing to its use of summation rather than multiplication.

For FS 1.0 based on GPS and LIDAR implementation, the LIDAR likelihood is calculated using the beam model [23]. The GPS likelihood is the Gaussian error of the current estimated motion model position to the GPS observation position. The weight function becomes

$$\begin{aligned} w^{(i)}_{t} ~= & {} ~ w^{(i)}_{t-1} \frac{1}{\sqrt{2\pi \sigma ^{2}_{\mathrm {ldr}}}}\frac{1}{N}\sum ^{N}_{n=1} e^{\frac{-{(z^{(i)}_{\mathrm {ldr} ~n,t}-\acute{z}^{(i)}_{ldr~n,t})}^{2}}{2\sigma ^{2}_{\mathrm {ldr}}}}\nonumber \\&\times\frac{1}{\sqrt{2\pi \sigma ^{2}_{\mathrm {gps}}}} e^{\frac{-{(x^{(i)}_{t}-{z}_{\mathrm {gps}~t})}^{2}}{2\sigma ^{2}_{\mathrm {gps}}}}, \end{aligned}$$
(12)

where \(x^{(i)}_{t}\) is the pose reported by the odometry-based motion model, and \(z_{\mathrm {gps}~t}\) is the observation pose for GPS. \(z^{(i)}_{\mathrm {ldr}~n,t}\) is the measurement from pose \(x^{(i)}_{t},\) and \(\acute{z}^{(i)}_{\mathrm {ldr}~n,t}\) is the true measurement on map\((x,y)^{T}.\) \(\sigma _{\mathrm {gps}}\) is the GPS measurement error and \(\sigma _{\mathrm {ldr}}\) is the LIDAR scan-matching measurement error. N is the total number of LIDAR laser beams and n is the identification number for a laser beam. The log-likelihood importance weight becomes

$$\begin{aligned} \hat{w}^{(i)}_{t} ~= & {} ~ w^{(i)}_{t-1}+w^{(i)}_{\mathrm {constant}~t} + w^{(i)}_{\mathrm {gps}~t} + w^{(i)}_{\mathrm {ldr}~t} \nonumber \\ \quad\quad~\propto & {} ~ w^{(i)}_{t-1}+ w^{(i)}_{\mathrm {gps}~t} + w^{(i)}_{\mathrm {ldr}~t}, \end{aligned}$$
(13)

where

$$\begin{aligned} {w}^{(i)}_{\mathrm {gps}~t} ~= & {} ~ \log e^{\frac{-{(x^{(i)}_{t}-{z}_{\mathrm {gps}~t})}^{2}}{2\sigma ^{2}_{\mathrm {gps}}}},\end{aligned}$$
(14)
$$\begin{aligned} {w}^{(i)}_{\mathrm {ldr}~t} ~= & {} ~ \log \frac{1}{N}\sum ^{N}_{n=1} e^{\frac{-{(z^{(i)}_{\mathrm {ldr}~n,t}-\acute{z}^{(i)}_{\mathrm {ldr}~n,t})}^{2}}{2\sigma ^{2}_{\mathrm {ldr}}}}, \end{aligned}$$
(15)

and

$$\begin{aligned} {w}^{(i)}_{\mathrm {constant}~t} ~=~ \log \frac{1}{\sqrt{2\pi \sigma ^{2}_{\mathrm {ldr}}}}+ \log \frac{1}{\sqrt{2\pi \sigma ^{2}_{\mathrm {gps}}}}. \end{aligned}$$
(16)

Constant \(\hat{w}^{(i)}_{\mathrm {constant}~t}\) is omitted from Eq. (13) because importance \(\hat{w}^{(i)}_{t}\) is marginal [23]. For the latter, FS 2.0 based on GPS and LIDAR, the likelihood weight function is derived in the same way as FS 1.0:

$$\begin{aligned} w^{(i)}_{t} ~= & {} ~ w^{(i)}_{t-1} \frac{1}{\sqrt{2\pi \sigma ^{2}_{\mathrm {ldr}}}} \frac{1}{N}\sum ^{N}_{n=1}e^{\frac{-{(\hat{z}^{(i)}_{\mathrm {ldr}~n,t}-\acute{z}^{(i)}_{\mathrm {ldr}~n,t})}^{2}}{2\sigma ^{2}_{\mathrm {ldr}}}} \nonumber \\&\times\frac{1}{\sqrt{2\pi \sigma ^{2}_{\mathrm {gps}}}} e^{\frac{-{(\hat{x}^{(i)}_{t}-{z}_{\mathrm {gps}~t})^{2}}}{2\sigma ^{2}_{\mathrm {gps}}}}, \end{aligned}$$
(17)

\(\hat{x}^{(i)}_{t}=\mathrm {argmax}_{\mathrm {x}}p(\hat{x}^{(i)}_{t} \mid m^{(i)}_{t-1},z_{t},x^{(i)}_{t})\) is the pose reported by scan-matching using the so-called beam end-point [1]. \(\hat{z}^{(i)}_{\mathrm {ldr}~n,t}\) is the LIDAR measurement from pose \(\hat{x}^{(i)}_{t}\) for a LIDAR’s laser beam and \(\acute{z}^{(i)}_{\mathrm {ldr}~n,t}\) is the true measurement on map \((x,y)^{T}.\) N is the total number of LIDAR laser beams and n is the identification number for a laser beam. The log-likelihood importance weight thus becomes:

$$\begin{aligned} \hat{w}^{(i)}_{t} ~\propto ~ w^{(i)}_{t-1} + \hat{w}^{(i)}_{\mathrm {gps}~t} + \hat{w}^{(i)}_{\mathrm {ldr}~t}, \end{aligned}$$
(18)

where

$$\begin{aligned} {w}^{(i)}_{\mathrm {gps}~t} ~= & {} ~ \log e^{\frac{-{(\hat{x}^{(i)}_{t}-{z}_{\mathrm {gps}~t})}^{2}}{2\sigma ^{2}_{\mathrm {gps}}}}.\end{aligned}$$
(19)
$$\begin{aligned} {w}^{(i)}_{\mathrm {ldr}~t} ~= & {} ~ \log \frac{1}{N}\sum ^{N}_{n=1}e^{\frac{-{(\hat{z}^{(i)}_{\mathrm {ldr}~n,t}-\acute{z}^{(i)}_{\mathrm {ldr}~n,t})}^{2}}{2\sigma ^{2}_{\mathrm {ldr}}}}. \end{aligned}$$
(20)

Evaluation

Our method was evaluated in a closed petrochemical complex in Japan (shown in Fig. 3), which was characterized by typical attributes of Japanese petrochemical complexes. The size of the environment was 550 m in width and 380 m in length. There were two areas: (1) management and oil refinery buildings located near a road (Area 1), and (2) large oil tanks sparsely distributed for safety reasons (Area 2). An electric vehicle (EV) was used to collect sensor data because firefighter robots have a car-like mechanism and the EV has the same mechanism and size. During data collection, the EV was manually driven at a speed of 5 km/h for safety concerns. The EV was equipped with an RTK-Fix-GPS receiver with an error of 0.02 m, a long-range 3D LIDAR with an error of 0.02 m, an inertial measurement unit (IMU) with an error of 0.1 deg/s, and an odometer equipped with a rotary encoder having an error of 0.1 m/s.

The sensor data were recorded during data collection. The vehicle was stopped at each cross-junction until the GPS measurement acquired an RTK-Fix-GPS for additional RTK-Fix-GPS data. The junctions had wide satellite visibility and were thus suitable for acquiring RTK-Fix-GPS measurements. During the whole experiment, the RTK-Fix-GPS availability was 56% as shown in Fig. 4.

For validation, we compared (a) FS 2.0 based on GPS and LIDAR, which is proposed in this paper, (b) FS 2.0 based on LIDAR [1], (c) FS 1.0 in a grid map (FS 1.0) based on GPS and LIDAR, also proposed in this paper, (d) FS 1.0 based on LIDAR, (e) FS 1.0 based on GPS, and (f) Karto SLAM based on LIDAR, which is a kind of open-source graph-based SLAM in a grid map [25]. We evaluated (1) global accuracy of the maps based on aerial map data and (2) local accuracy of the maps based on FARO Focus 3D data for different SLAMs. Map consistency was also evaluated by comparing the maps with aerial images.

These SLAM methods used open source libraries provided by OpenSLAM and Robot Operating System (ROS).

For Fast-SLAM, we modified the SLAM Gmapping library. The authors of Gmapping are Giorgio Grisetti, Cyrill Stachniss, and Wolfram Burgard. It is distributed by OpenSLAM, and the ROS wrapper is provided by the ROS community. We modified the weight of importance based on Eqs. 13 and 18 for FS 1.0 and FS 2.0 based on GPS and LIDAR, respectively. In addition, for FS 1.0 based on GPS and LIDAR, we disabled scan matching for the motion model based on scan matching [26].

The parameters used for Fast-SLAM are explained here. A grid map resolution of 0.2 m and 50 particles were used because they comprise the maximum values allowed by our computer for this environment size. The adaptive resampling threshold was set to a value of 0.01 because we desired a higher resampling rate when GPS value was available. We strived to use GPS as a closed loop; that is, when we received sufficient GPS data, we desired to have a closed loop by resampling. The beam end-point model two-dimensional scan matcher was used because it produced better results than ICP in unstructured outdoor environments [23]. The computation time for our implementation was the same as that of Grisetti et al. [1] because the additional weight calculation was very short.

For LIDAR, the sigma parameter value, \(\sigma _{\mathrm {ldr}},\) we use is 0.2 m. To fuse GPS with LIDAR data, we evaluated several combination parameters of GPS sigma, \(\sigma _{\mathrm {gps}},\) with the LIDAR sigma of 0.2 m as shown in Fig. 5. The result shows that, when the GPS sigma, \(\sigma _{\mathrm {gps}},\) is 0.5 m and larger; the Fast-SLAM path follows the RTK-GPS data, while \(\sigma _{\mathrm {gps}}\) is less than 0.5 m; the Fast-SLAM path have a larger position difference with the RTK-GPS. Therefore, we need to use \(\sigma _{\mathrm {gps}}\) equal to 0.5 m or larger for the Fast-SLAM path to follow the RTK-GPS data.

Fig. 3
figure 3

Petrochemical complex environment. Area 1 (blue) features management and oil refinement buildings located near a road; Area 2 (red) shows an area of large oil tanks positioned sparsely for safety reasons

We have chosen 2 m as the \(\sigma _{\mathrm {gps}}\) value to compensate for the RTK-Fix-GPS multipath error when moving near the building area. Lee et. al. have evaluated the RTK-Fix-GPS horizontal multipath error, which is 1.069 m [27]. To prevent overconfidence on the RTK-Fix-GPS data, we carefully fused GPS data setting a value of 2 m for \(\sigma _{\mathrm {gps}}.\)

For Graph SLAM, we used Karto SLAM [28]. We used a default parameter because it produced the best results when compared with several parameters.

For 3D mapping, a 3D occupancy map stored the 3D point cloud data to erase moving trucks and people from the 3D map [29]. The 3D map was built based on the final trajectory of SLAM.

Fig. 4
figure 4

Petrochemical complex environment. The red path is the robot trajectory; the green path is the RTK-Fix-GPS data

Fig. 5
figure 5

Fast-SLAM trajectory result based on RTK-GPS sigma values, which are 0.2 (dark red), 0.5 (red), 1.0 (brown), 1.5 (green), 2.0 (black), 2.5 (dark blue), and 3.0 (blue). RTK-GPS measurement (cyan)

Result

Global map consistency

Fig. 6
figure 6

Global consistency evaluation using an aerial map as a reference. Each building position in the 2D map (yellow) was compared with building the 2D position in the aerial map

Aerial map building position data were used as the global positioning reference. For the evaluation, a robot 2D map and the aerial map were overlapped based on their respective GPS coordinates. Figure 6 shows the visual result. Figure 7 shows the numerical result. FS 2.0 and FS 1.0 based on GPS and LIDAR showed small position errors of 0.65 and 0.81 m, respectively.

Using only the LIDAR sensor, FS 2.0 and FS 1.0 based on LIDAR showed large position errors of 6.82 and 7.77 m, respectively. Karto SLAM based on LIDAR also showed a large position error of 6.44 m. Using only the GPS sensor, FS 1.0 based on GPS had a significantly large position error of 1.36 m caused by an incorrect assessment of GPS measurements near the building area. Therefore, the result showed that both FS 1.0 and FS 2.0 based on GPS and LIDAR had a high global accuracy. In particular, FS 2.0 based on GPS and LIDAR had the highest global accuracy. In addition, mapping consistency was required, including both the correct building position and correct building size. Accordingly, the local error was evaluated.

Fig. 7
figure 7

Global consistency result

Local map consistency

Figures 9 and 10 show the local accuracy of the map by evaluating the local building size. Faro Focus 3D data with high accuracy was used as the building size reference, as shown in Fig. 8, while a Fig. 3 shows the locations of Area 1 and Area 2.

FS 1.0 and FS 2.0 based on GPS and LIDAR had a high local accuracy in both Areas 1 and 2. For FS 2.0 based on GPS and LIDAR, the error in Area 1 was 0.17 m (Fig. 9), and the error in Area 2 was 0.08 m (Fig. 10). For FS 1.0 based on GPS and LIDAR, the error showed a small increase compared to that of FS 2.0: in Area 1 it was 0.25 m (Fig. 9); in Area 2 it was 0.26 m (Fig. 10). For both methods, no building shape duplication is visually evident in Figs. 9 and 10.

In addition, FS 2.0 based on LIDAR showed a high accuracy in Area 1 and a low accuracy in Area 2. On the other hand, FS 1.0 based on LIDAR had a high accuracy in Area 2 and a low accuracy in Area 1. FS 2.0 based on LIDAR showed a high accuracy of 0.39 m in Area 1 (Fig. 9) and very low local accuracy in Area 2 of 3.04 m (Fig. 10). FS 1.0 based on LIDAR had a very low accuracy of 2.85 m occurring in Area 1 because of a failed close-loop (Fig. 9) and a high local accuracy of 0.61 m in Area 2 (Fig. 10). The local error is visually apparent from obviously duplicated buildings for FS 2.0 based on LIDAR in Fig. 10 and for FS 1.0 based on LIDAR in Fig. 9.

Furthermore, FS 1.0 based on GPS had a low local accuracy in Areas 1 and 2. The error in Area 1 was 0.31 m (Fig. 9) and that in Area 2 was 0.57 m (Fig. 10). Duplicates of building shapes are observed in Figs. 9 and 10. The graph SLAM based on LIDAR in a grid map (Karto SLAM) [25] had a high accuracy in Area 2, but a low accuracy in Area 1, which was caused by a failed closed loop. The low accuracy in Area 1 was 1.18 m (Fig. 9) and the high accuracy in Area 2 was 0.18 m (Fig. 10). Duplicates of building shapes are evident in (Fig. 10).

Thus, the results showed that both FS 1.0 and FS 2.0 based on GPS and LIDAR had local accuracy in both areas.

Fig. 8
figure 8

Faro Focus 3D map measured for a locally consistent reference. a Faro Focus top view. b Example of the mapping result from the RBPF approach

Fig. 9
figure 9

Result of local consistency in Area 1

Fig. 10
figure 10

Result of local consistency in Area 2

Visual map consistency with an aerial image

Figure 11 result shows a 3D map built by FS 2.0 based on GPS and LIDAR from a side view. This was the best 3D map result. Figures 12 and 13 show the visual consistency between 3D maps and an aerial map [2] from Google, which was the reference. FS 2.0 based on GPS and LIDAR provided correct building shape consistency and correct building positions (Fig. 12). FS 1.0 based on GPS and LIDAR had correct building shape consistency and correct building positions (Fig. 13).

Fig. 11
figure 11

3D map built from FS 2.0 based on GPS and LIDAR from a side view

Fig. 12
figure 12

3D map built from FS 2.0 based on GPS and LIDAR (consistency with the aerial map): 3D point cloud with correct building shapes and 3D point cloud positions aligned with the aerial map

Fig. 13
figure 13

3D map built from FS 1.0 based on GPS and LIDAR (consistency with the aerial map): 3D point cloud with correct building shapes and 3D point cloud positions aligned with the aerial map

Reproducibility by multiple runs

We validated our method reproducibility based on the results of global position variance between ten runs in runs in the petrochemical complexes. To evaluate the reproducibility, we selected four points to cover the entire area on the map because three is the minimum number of points required to evaluate plane geometrical differences, which include rotation, location, and scale. The result shows the global position variance between all runs is ± 0.3 m.

Reproducibility for different environments.

In addition, we intended to confirm SLAM reproducibility in petrochemical complexes with low-RTK-Fix-GPS availability. Because GPS data availability is one of the main factors for map consistency, we selected two additional environments for this purpose. The first environment was Factory A with 20.9% RTK-Fix-GPS availability, as shown in Fig. 14a. The second environment was Factory B, which was similar to a petrochemical complex management area with 29.9% RTK-Fix-GPS availability, as shown in Fig. 15a.

To solve the issue of low RTK-Fix-GPS data availability, we used both RTK-Float-GPS, which has the accuracy of the order of a decimeter, and RTK-Fix-GPS data, which have accuracies to the order of a centimeter [30]. As a result, the RTK-fix-GPS measurement availability increased from 20.9 to 58.1% in Factory A, as shown Fig. 14b, and from 29.9 to 59.7% in Factory B, as shown in Fig. 15b. For the weight of important calculation, we increased GPS \(\sigma _{\mathrm {gps}}\) from 2 to 4 m for the RTK-Float-GPS error. In rare cases, the multipath error can become much higher, such as those in Fig. 16, which shows the RTK-Fix-GPS and RTK-Float-GPS data; the red arrow points to a multipath error. Figure 17 shows the enlarged image of the multipath error observed in Fig. 16. To prevent overconfidence on the RTK-Float-GPS data, we use the setting value of 4 m for \(\sigma _{\mathrm {gps}}\).

Fig. 14
figure 14

Red denotes the robot trajectory and green is the GPS data in a Factory A. a GPS availability of 20.9%: RTK-Fix-GPS; b GPS availability of 58.1%: RTK-Fix-GPS and RTK-Float-GPS

Fig. 15
figure 15

Red path denotes the robot trajectory and green is the GPS data in a Factory B. a GPS availability of 29.9%: RTK-Fix-GPS; b GPS availability of 59.7%: RTK-Fix-GPS and RTK-Float-GPS

Fig. 16
figure 16

RTK-Fix-GPS and RTK-Float-GPS data (green) and estimated RTK-Fix-GPS and RTK-Float-GPS trajectory (black line) with multipath error (red arrow)

Fig. 17
figure 17

Multipath error enlarged from Fig. 16. RTK-Float-GPS data (green), estimated RTK-Float-GPS trajectory (black line), and RTK-Float-GPS measurement error of 7.26 m caused by the multipath error (red line)

For evaluation, we used mobile mapping system (MMS) data with an accuracy of 0.1 m [31]. Environments of Factories A and B were not restricted; therefore, we could measure MMS data. We evaluated the global accuracy and visual consistency of the maps on the basis of commercial point-cloud data, as shown in Fig. 18a. For the evaluation, the GPS x–y position was used to overlap the SLAM-constructed map with the MMS map, as shown in Fig. 18c. We used the Gauss-Kruger projection method calculation, which was also used by MMS to determine x–y coordinates based on GPS latitude and longitude. The corresponding accuracy was in the order of micrometers [32]. We evaluated the map consistency based on 14 buildings in Factory A and 22 buildings in Factory B (Table 1).

Fig. 18
figure 18

Data in petrochemical complexes: a red denotes MMS point-cloud data; b 2D map by proposed method; c using GPS position, MMS data overlap our 2D map

Table 1 Global accuracy

Reproducibility of global accuracy for FS 2.0 based on GPS and LIDAR in Factories A and B were 0.52 and 0.72 m, respectively. For FS 1.0 based on GPS and LIDAR, the reproducibility in Factories A and B were 0.70 and 0.77 m, respectively. On other hand, for other me thods of FS 2.0 based on LIDAR, FS 1.0 based on LIDAR, and Karto SLAM the global errors were larger than 16.23 m.

Reproducibility of visual consistency for FS 2.0 based on GPS and LIDAR and FS 1.0 based on GPS and LIDAR in both cases are visually consistent because no distortions are obviously visible between MMS data with map, as shown in Fig. 19a, b for Factory A, and Fig. 20a, b for Factory B. However, visual results for Karto SLAM are not consistent because a large distortion is observed n in Factory A in Fig. 19c and in Factory B, as shown in Fig. 20c.

Fig. 19
figure 19

Data in petrochemical complexes where red is ground truth MMS data and the result of SLAM: a FS 2.0 based on GPS and LIDAR map and b FS 1.0 based on GPS and LIDAR map are visually shown to have consistency. On the other hand, c Karto SLAM based on LIDAR map has no consistency because large distortions are evident on the map

Fig. 20
figure 20

Data in a factory where red is ground truth MMS data and the result of SLAM: a FS 2.0 based on GPS and LIDAR map and b FS 1.0 based on GPS and LIDAR map are observed to have consistency. However, c Karto SLAM based on LIDAR map has no consistency because large distortions are apparent on the map

Discussion

The proposed approach employs two SLAMs based on GPS and LIDAR and the RBPF weight of importance for sensor fusion. Our results showed that the robot-obtained map corresponded well with a geo-referenced map. This is very important for firefighters because they send target positions to robots using robot obtained maps, and they can understand the robot obtained maps by referencing real maps.

In our study, both Fast-SLAM successfully aligned positions with GPS geo-referenced data and produced locally consistent maps. FS 2.0 based on GPS and LIDAR showed the best consistency with real maps, as evidenced by comparing results with those of FS 1.0 based on GPS and LIDAR, FS 2.0 and FS 1.0 based on LIDAR, FS 1.0 based on GPS, and Karto SLAM based on LIDAR. FS 2.0 based on GPS and LIDAR had accurate alignment with the geo-referenced positions: the trajectory error was 0.65 m based on the aerial map data, as shown in Fig. 7. The map distortion was 0.17 m near buildings and 0.08 m near the area with oil tanks (Figs. 9 and 10).

Figure 12 shows that FS 2.0 based on GPS and LIDAR is consistent with an aerial map. FS 1.0 based on GPS and LIDAR had a small increase of alignment error to GPS compared to FS 2.0. The trajectory error was 0.81 m based on the aerial map data (Fig. 7). The map distortion was 0.25 m near buildings and 0.026 m near the area with oil tanks (Figs. 9 and 10). Figure 13 shows that FS 1.0 based on GPS and LIDAR is also consistent with the aerial map.

Our method could reproduce the map variance of 0.3 m for 10 runs. Furthermore, we could reproduce map consistency for our method in two low-RTK-Fix-GPS environments by using both RTK-Fix-GPS and RTK-Float-GPS. The visual cue and global accuracy were used to show map consistency.

FS 2.0 based on GPS and LIDAR are visually consistent, as shown in Fig. 19a in Factory A and Fig. 20a in Factory B. The global accuracy is 0.52 and 0.72 m in Factory A and B, respectively. FS 1.0 based on GPS and LIDAR are also visually consistent, as shown in Fig. 19b for Factory A, and Fig. 20b for Factory B. The global accuracy error, specifically, 0.18 m in Factory A and 0.05 m in Factory B, increased compared to that of FS 2.0 based on GPS and LIDAR.

The limitation of this method pertains to roads with no ground level differences. This method does not work well in environments with large ground level differences. However, this method satisfies the firefighter robot requirement in petrochemical complexes where the roads have no ground differences.

In this work, we selected RTK-Fix-GPS and RTK-Float-GPS. In addition to filter good measurements, we filtered GPS data with the horizontal dilution of precision (HDOP) \(>1.2\) to use only accurate data. However, we still observed incorrect RTK-Fix-GPS and RTK-Float-GPS when moving near building. These errors caused GPS multipath signal errors. To prevent over overconfidence on RTK-GPS data, we carefully fused GPS data by using large GPS sigma \(\sigma _{\mathrm {gps}}.\)

Conclusion

In this study, we extended two RBPFs based on GPS and LIDAR for consistent map construction in petrochemical complexes. FS 2.0 and FS 1.0, both in a grid map, were used for SLAM. We herein proposed a weight function that fuses GPS and LIDAR data inside the RBPFs. An importance weight function is derived to achieve sensor fusion. The weight function enables maintenance of the respective advantages of both GPS and LIDAR sensors. Therefore, these RBPFs based on GPS and LIDAR not only have local consistency, but also global consistency. An experiment was conducted in a closed petrochemical complex in Japan (550 m × 380 m). RTK-GPS availability was 56% in the petrochemical complex. Our results showed FS 2.0 had the best result with a significant improvement of alignment to geo-referenced positions. The mean global error was 0.65 m. A significant result for mapping consistency was 0.17 m near buildings and 0.08 m near sparsely placed oil tanks. Results also showed that both maps had consistency with an aerial map [2]. Our method could reproduce map consistency results for ten runs with a variance of ± 0.3 m. Our method reproduced map consistency results in low RTK-Fix-GPS environment, which was a factory with a building layout similar to petrochemical complexes with 20.9% of RTK-Fix-GPS data availability. The best global accuracy achieved was 0.52 m (Additional file 1).