1 Introduction

Digital watermarking has been used to protect vector geographic data for more than ten years, and many watermarking algorithms have been proposed for such data [4, 9, 10, 11, 12, 18, 19, 21, 22, 24, 25, 29]. With the rapid development of information technologies, more approaches are required for copyright protection. For example, data producers may update vector geographic data, and each version is marked with an identifier during the update process. The data may also be distributed to multiple parties in confidential environments, with every party marking the data to confirm authorization. It is therefore necessary to embed multiple watermarks in the data. However, the number of watermarks that can be embedded is restricted by the size of the cover data. Meanwhile, cropping is a typical operation on vector geographic data, and one or several watermarks may be removed by a cropping attack. Because multiple authorizations are required, damage to any watermark is undesirable. This means that all watermarks should be preserved after the vector geographic data are cropped.

However, current multiple watermarking algorithms commonly focus on multimedia [1, 2, 14, 15, 16, 20, 27, 28]. Multiple watermarking algorithms for vector geographic data are rarely investigated. To the best of our knowledge, only a few previous works address this specific topic. In [23], two watermarks are combined into one composite watermark, which is then embedded in the cover data using the DFT algorithm. The two watermarks must be provided before embedding, and they affect each other. In [13], the coordinates are recorded when the first watermark is embedded, and the other watermarks are then embedded according to this record. The watermark detection procedure requires both the original data and the record, so the process is not blind. In [3], four multiple watermarking algorithms are designed for GIS vector data. The first embeds two watermarks in the DCT domain and the DFT domain. The second is based on frequency division, in which the watermarks are embedded in the low, intermediate, or high frequencies, respectively. The third divides the data into different blocks to embed the watermarks, drawing on the algorithms of other scholars. The fourth is a combination of single watermarking and zero-watermarking algorithms. The watermark capacity of these algorithms is limited, and they are not robust to cropping attacks. In [31], the first watermark is embedded in the spatial domain and the second in the DFT domain, and the embedding order cannot be changed. Cui proposed three methods in his doctoral dissertation [7]. The first embeds the watermarks in both the X and Y coordinates, so only two watermarks can be embedded in the cover data. The second combines all watermarks into one before embedding. Both methods require that all watermarks be provided in advance. To overcome these restrictions, the third method divides the cover data into blocks according to a quad-tree algorithm and embeds the watermarks in different blocks. However, the applicability of the algorithm in [7] is limited because it is weak against cropping attacks. In [26], several watermarks are embedded in the same coordinates one by one using an additive embedding rule, and the original data are required in the detection process.

To preserve all watermarks in the watermarked data after cropping, we propose a multiple watermarking algorithm for vector geographic data that defends against cropping attacks. The mapping relationships between the vertex coordinates, the logic domains, and the watermark bit indexes are first established. Then, the logic domains are subdivided into blocks to embed multiple watermarks. Since the mapping relationships are built before the blocks are subdivided, the embedded watermarks are difficult to remove in a cropping attack. The mapping and subdividing methods also achieve a high capacity.

The remaining sections are organized as follows. Section 2 presents the multiple digital watermarking algorithm. Section 3 provides the experimental results of the algorithm. The conclusions are summarized in Section 4.

2 The proposed multiple watermarking algorithm

In a multiple watermarking algorithm, the watermarks can be embedded into the cover data step by step or simultaneously. Embedding the watermarks step by step can be used for multi-user tracking and is more flexible than embedding them simultaneously. Therefore, we propose a multiple watermarking algorithm that embeds the watermarks step by step. Figure 1 shows the whole procedure of the algorithm.

Fig. 1 The proposed multiple watermarking algorithm

2.1 Watermark generation

To improve the multiple watermark capacity and the watermark detection reliability, we use pseudorandom binary sequences as the watermarks. A pseudorandom sequence generator is used to generate them: a watermark seed, usually a random integer, is generated first, and the pseudorandom binary sequence to be used as a watermark is then generated from the seed. The watermarks differ because different watermark seeds are used. A watermark is \( W_j = \{ w_j[i], 0 \le i < L \} \), where \( 0 \le j < N \), N is the number of watermarks, i is the watermark bit index, L is the length of the watermark, and \( w_j[i] \) is the ith watermark bit of the jth watermark. Additionally, \( w_j[i] \in \{-1, 1\} \), with \( P(w_j[i] = -1) = 1/2 \) and \( P(w_j[i] = 1) = 1/2 \); that is, the two bit values are equally probable. Figure 2 shows an example of the relationship between the watermark seed, the watermark, the watermark bits, and the watermark bit indexes.

Fig. 2 The relationship between the watermark seed, the watermark, the watermark bits, and the watermark bit indexes
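The generation step can be illustrated with a short sketch. It assumes that Python's standard `random.Random` plays the role of the pseudorandom sequence generator, which the text does not specify; the function name `generate_watermark` and the seed values are hypothetical.

```python
import random


def generate_watermark(seed, length):
    """Return a pseudorandom watermark of `length` bits, each -1 or 1.

    Assumption: random.Random stands in for the unspecified pseudorandom
    sequence generator; it is not cryptographically secure.
    """
    rng = random.Random(seed)  # the same seed always reproduces the same watermark
    return [1 if rng.getrandbits(1) else -1 for _ in range(length)]


# Two hypothetical seeds give two different watermarks of length L = 200.
w0 = generate_watermark(seed=20231, length=200)
w1 = generate_watermark(seed=77, length=200)
```

Because the sequence is fully determined by the seed, only the seed needs to be stored in order to regenerate a watermark at detection time.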

2.2 Watermark embedding

2.2.1 Coordinate mapping and domain subdivision

Vector geographic data consist of vertex coordinates. Vertex coordinates are the fundamental units of the points, polylines, and polygons that describe geographical objects, such as wells, rivers, and residential areas. Generally, digital watermarking algorithms are categorized into spatial domain [4, 10, 11] and frequency domain algorithms [12, 21]. In both kinds of algorithms, the vertex coordinates provide the space for embedding the watermark. The difference between them is whether the watermark is embedded by directly modifying the vertex coordinates. In addition, spatial domain algorithms have good robustness against common attacks, such as vertex addition, vertex deletion, simplification, and cropping. In particular, the spatial domain algorithm based on vertex coordinate mapping is a good method for resisting cropping attacks [30].

The range of the vertex coordinates differs from map to map, so it is difficult to directly establish a stable relationship between the vertex coordinates and the watermark bit index. Therefore, a logic domain is defined artificially as a fixed domain to which the vertex coordinates are mapped, and it is used as an intermediary to establish the mapping relationship between the vertex coordinates and the watermark bit index. To embed multiple watermarks in a 2D vector geographic map, the logic domains are subdivided into blocks, each of which carries a different watermark. Because the watermarks are embedded one by one, we use the dichotomy method to divide the logic domains so that as many watermarks as possible can be embedded. The specific process is as follows:

  1. Let the vertex coordinates of the vector geographic data be the set VC, with \( VC = \{ vc_i \mid (x_i, y_i), 0 \le i < M \} \), where M is the number of vertex coordinates, and \( (x_i, y_i) \) is the coordinate of the ith vertex. Let the logic domains be the set LD, with \( LD = \{ ld_i, 0 \le i < L \} \). The mapping relationship is established according to the following method. Figure 3 presents the process of coordinate mapping.

Fig. 3 The diagram of vertex coordinate mapping

  • First, VC is mapped into a region of size R_x × R_y, where R_x ≪ OM_x and R_y ≪ OM_y. OM_x and OM_y are, respectively, the width and height of the bounding rectangle of the map.

  • Second, the region is divided into NR × NC logic domains, \( ld_i \), according to Eq. (1), where NR × NC = L.

$$ \left\{\begin{array}{c} NR=R\_y/ LD\_y\\ {} NC=R\_x/ LD\_x\end{array}\right. $$
(1)

R_x, R_y, LD_x, and LD_y are all positive integers. Additionally, R_x is divisible by LD_x, and R_y is divisible by LD_y, so that NR and NC are integers.

  • Finally, let kr_i and kc_i be the row and column indexes of the logic domain to which the coordinate \( (x_i, y_i) \) is mapped, where

$$ \left\{\begin{array}{c} kr\_i=\left\lfloor \left(\left\lfloor {y}_i\cdot sc\right\rfloor \%R\_y\right)/ LD\_y\right\rfloor \\ {} kc\_i=\left\lfloor \left(\left\lfloor {x}_i\cdot sc\right\rfloor \%R\_x\right)/ LD\_x\right\rfloor \end{array}\right. $$
(2)

where sc is a scaling parameter that controls the distortion of the cover data after the watermark is embedded, % denotes the modulo (remainder) operation, and ⌊a⌋ is the greatest integer not exceeding a. Then, the mapping relationship between the logic domain in row kr_i and column kc_i and the watermark bit index, k, is

$$ k= kr\_i\cdot NC+ kc\_i $$
(3)
  2. According to the number of watermarks, N, every logic domain, \( ld_i \), is subdivided into N blocks step by step, as shown in Fig. 4. The corresponding watermark bit is embedded in every block, \( ld_{i\_j}(w_j[i]) \), where 0 < j ≤ N and 0 ≤ i < L. In the process of watermark embedding, we use a quantization scheme so that the watermark currently being embedded replaces the existing watermark.

Fig. 4 The process of logic domain subdivision
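The coordinate mapping of Eqs. (1) and (2) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `bit_index` and all parameter values are hypothetical, and the bit index is taken in row-major order (row index times the number of columns, plus the column index), which for the square grid used here gives the same value whether NR or NC is used as the multiplier.

```python
import math


def bit_index(x, y, sc, R_x, R_y, LD_x, LD_y):
    """Map a vertex (x, y) to a watermark bit index following Eqs. (1)-(2).

    Assumption: the index is taken in row-major order, i.e. the row
    index times the number of columns plus the column index.
    """
    if R_x % LD_x or R_y % LD_y:
        raise ValueError("R_x and R_y must be multiples of LD_x and LD_y")
    NC = R_x // LD_x                             # number of columns, Eq. (1)
    kr = (math.floor(y * sc) % R_y) // LD_y      # row index, Eq. (2)
    kc = (math.floor(x * sc) % R_x) // LD_x      # column index, Eq. (2)
    return kr * NC + kc


# Hypothetical parameters: a 100 x 100 region split into 10 x 10 logic domains (L = 100).
params = dict(sc=1, R_x=100, R_y=100, LD_x=10, LD_y=10)
k_a = bit_index(1234.5, 678.9, **params)
k_b = bit_index(1234.5 + 100, 678.9 + 300, **params)  # shifted by multiples of R_x and R_y
```

Both vertices map to the same bit index, because the modulo operation makes the mapping periodic over the map. Vertices that are far apart therefore carry the same watermark bit.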

2.2.2 Embedding multiple watermarks

In the embedding process, the same watermark bit is repeatedly embedded into all the coordinates that fall in the block corresponding to a given index. Additionally, the corresponding watermarks are successively embedded in the blocks of every logic domain. This arrangement provides robustness against cropping attacks. For instance, in Fig. 5, if the coordinates in the gray rectangles are cropped, W_4 will not be removed.

Fig. 5 The relationships among the coordinates, logic domains, and blocks

The embedding process is as follows. First, the mapping relationship is established according to Section 2.2.1. Second, every mapped logic domain is subdivided into N blocks according to the number of watermarks. Finally, a quantization embedding algorithm [5] is used to embed every watermark bit. Figure 6 describes the embedding process for one watermark.

Fig. 6 The flow chart of the watermark, \( W_j \), embedding process based on quantization. qr is the quantization step, which is an even number
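Since the flow chart itself is not reproduced here, the following is only a generic quantization-based embedding rule, not necessarily the exact rule of [5]. It assumes that `value` is a scaled integer coordinate such as ⌊x·sc⌋ and that a bit is encoded by the half of the quantization interval of width qr in which the value lies. The names `embed_bit` and `extract_bit` are hypothetical.

```python
def embed_bit(value, bit, qr):
    """Move an integer coordinate into the half of its qr-interval that encodes `bit`.

    Assumption: the lower half encodes -1 and the upper half encodes 1.
    """
    if qr % 2:
        raise ValueError("qr must be even")
    half = qr // 2
    base = value - value % qr        # start of the quantization interval
    offset = value % half            # position inside the half-interval
    return base + offset if bit == -1 else base + half + offset


def extract_bit(value, qr):
    """Read the bit back from the position of the value in its qr-interval."""
    return 1 if value % qr >= qr // 2 else -1


marked = embed_bit(37, -1, qr=10)        # 37 lies in the upper half and is moved down
remarked = embed_bit(marked, 1, qr=10)   # a later bit overwrites the earlier one
```

Under this rule a coordinate changes by either 0 or qr/2, and embedding a new bit in the same coordinate simply overwrites the old one, which matches the replacement behaviour described in Section 2.2.1. If LD_x and LD_y are multiples of qr, the modified value also stays in the same logic domain.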

2.3 Watermark extraction and detection

Usually, the number of watermarks in the detected data is unknown, so heuristic detection is adopted in this paper to detect potential multiple watermarks. Watermark extraction and detection proceed as follows.

  1. Let u and U be counters, with u = 1 and U = 64. Let the set of detected watermarks be W, with W = ∅.

  2. Suppose there are u watermarks in the detected data. According to u, the subdivided blocks of every logic domain, \( ld_i \), are calculated.

  3. Extract the u watermarks from the detected data according to the above logic domain subdivision. Figure 7 shows the process of extracting every watermark.

  4. Detect the watermarks by calculating the correlation between each extracted watermark and the original watermark, as shown in Eq. (4). Add the watermarks detected in this round to W.

  5. Then, let u = u + 1. If u ≤ U, repeat the previous steps beginning with step 2. If u > U, the watermark extraction and detection processes are finished, and W contains all the watermarks detected from the detected data.

Fig. 7 The flow chart of the \( {W}_j^{\prime } \) extraction process

u is the supposed number of watermarks in current watermark detection, and U is the maximum probable number of watermarks. U = 64 in the experiments in Section 3.

After watermark extraction, the watermark is detected by correlation detection. Let cor be the correlation coefficient between \( W_j \) and \( {W}_j^{\prime } \), as given in Eq. (4).

$$ cor=\frac{\sum \limits_{i=0}^{L-1}{w}_j\left[\mathrm{i}\right]\ast {w}_j^{\prime}\left[\mathrm{i}\right]}{L} $$
(4)

If there is no watermark in the detected data, the extracted watermark bit \( {w}_j^{\prime}\left[\mathrm{i}\right] \) satisfies \( P\left({w}_j^{\prime}\left[\mathrm{i}\right]=1\right)=1/2 \) and \( P\left({w}_j^{\prime}\left[\mathrm{i}\right]=-1\right)=1/2 \). Therefore, the correlation coefficient, cor, has a distribution that can be approximated by a normal distribution, as shown in Eq. (5).

$$ cor\sim N\left(0,\frac{1}{L}\right) $$
(5)

In watermark detection, the detection threshold is calculated by controlling the false positive rate (FPR). In this paper, a 4σ principle is used: the threshold is set to four standard deviations of the distribution in Eq. (5), i.e., \( 4/\sqrt{L} \), which ensures that the FPR is less than \( 10^{-4} \).

If cor is greater than the detection threshold, \( W_j \) is judged to be present in the detected data.
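The detection procedure of this section can be sketched as follows. The correlation and the 4σ threshold follow Eqs. (4) and (5). The callback `extract(u, j)`, the dictionary `originals`, and all function names are hypothetical stand-ins, since the extraction flow chart is not reproduced here. Comparing every extracted sequence with every known original watermark is also an assumption of this sketch.

```python
import math


def correlation(w, w_extracted):
    """Correlation coefficient of Eq. (4) for two sequences of -1/1 bits."""
    return sum(a * b for a, b in zip(w, w_extracted)) / len(w)


def threshold(length, k_sigma=4):
    """Detection threshold: k_sigma standard deviations of N(0, 1/L)."""
    return k_sigma / math.sqrt(length)


def heuristic_detect(extract, originals, max_count=64):
    """Try every assumed watermark count u = 1..max_count (steps 1-5).

    extract(u, j) is a hypothetical callback returning the j-th watermark
    extracted under the assumption of u embedded watermarks, or None.
    originals maps a watermark name to its original bit sequence.
    """
    detected = set()
    for u in range(1, max_count + 1):
        for j in range(u):
            extracted = extract(u, j)
            if extracted is None:
                continue
            for name, w in originals.items():
                if correlation(w, extracted) > threshold(len(w)):
                    detected.add(name)
    return detected


# One-sided false positive rate of a 4-sigma threshold under Eq. (5).
fpr = 0.5 * math.erfc(4 / math.sqrt(2))
```

For L = 200 the threshold is 4/√200 ≈ 0.283, and the one-sided tail probability beyond 4σ of a normal distribution is about 3.2 × 10⁻⁵, which is below the stated bound of 10⁻⁴.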

3 Experimental results and analysis

The experiments were performed on different digital vector geographic maps in shapefile format, which were provided by GeoMarking Company [17]. The maps are organized as points, polylines, and polygons. Figure 8 shows three examples of these maps.

Fig. 8 Examples of the experimental data. a Points in vector geographic data. b Polylines in vector geographic data. c Polygons in vector geographic data

In the first experimental stage, the robustness of the proposed algorithm against common attacks was analyzed, followed by a dedicated analysis of cropping attacks. In the second experimental stage, the multiple watermark capacity of the algorithm was demonstrated.

3.1 Algorithm robustness

The algorithm robustness for vector geographic data refers to the ability to detect the watermark after common operations [6], such as data simplification, randomly adding or deleting vertices, deleting features (for example, a polyline in the map), and cropping.

In the experimental procedure, four different watermarks were first embedded in the experimental map shown in Fig. 8(b). Then, the aforementioned operations were carried out on the watermarked maps. Finally, the attacked maps were checked to determine whether the watermarks were still present. The results are shown in Table 1, where √ indicates that the corresponding watermark is present in the data after the attack, and × indicates the opposite result. The same experiments were repeated on the other 29 experimental maps, and the results are in accordance with Table 1.

Table 1 Robustness experimental results for the map in Fig. 8(b)

In the test case, the Douglas-Peucker method [8] was adopted to simplify the watermarked maps, and the simplification percentage is the ratio of the simplified data to the original data. The simplification attack is shown in Fig. 9. A vertex addition attack adds vertices to the watermarked maps at random; a vertex deletion attack deletes vertices from the watermarked maps at random; a feature deletion attack deletes features from the watermarked data at random (as shown in Fig. 10); and a data cropping attack crops regions of the watermarked data, where the cropping percentage refers to the ratio of the cropped data to the original data.

Fig. 9 Simplification attack. a Watermarked map (part). b The simplified watermarked map (part)

Fig. 10 Feature deletion attack. a Watermarked map. b The watermarked map after feature deletion

Furthermore, vector sketch maps were used as experimental data to test the robustness of the algorithm, and Fig. 11 shows one example of these maps. The same experimental procedures as above were carried out on 10 different vector sketch maps, and the experimental results for Fig. 11 are shown in Table 2. The experimental results for the other sketch maps are in accordance with Table 2.

Fig. 11 One example of the experimental sketch maps

Table 2 Robustness experimental results for vector sketch map shown in Fig. 11

Based on the experimental results listed in Tables 1 and 2, we can see that the proposed algorithm has good robustness against attacks such as data simplification, vertex addition, vertex deletion, feature deletion, and cropping. In addition, the algorithm is suitable for different types of vector geographic data.

3.2 The robustness of the algorithm against a cropping attack

Common multiple watermarking algorithms embed the watermarks in non-overlapping regions. As a result, any watermark is easily removed by an unnoticed cropping attack. In this section, we demonstrate the robustness of the proposed algorithm against cropping attacks. Four watermarks were embedded in 10 vector geographic maps using the proposed algorithm, and arbitrary regions were then cropped from the maps. Figure 12 shows four examples of cropping for the experimental vector geographic map. For comparison, the same 4 watermarks were embedded in the same 10 experimental maps using the algorithm in [7], which is an improved block-division algorithm based on a quad-tree method, and the same cropping attacks were carried out on the watermarked maps. The experimental results are listed in Table 3, where √ indicates that the watermark is present in the data after the attack, and × indicates the opposite result.

Fig. 12 Cropping attacks. a, b, c, and d represent different types of cropping

Table 3 Robustness against cropping attack of different algorithms

From the results shown in Table 3, we can observe that, because the logic domains are established by building the mapping relationship before subdivision, the proposed algorithm preserves all of the watermarks in the watermarked map after cropping. In contrast, a few watermarks are lost after the cropping attack when the algorithm in [7] is used. The robustness of the proposed algorithm against cropping attacks is therefore superior to that of the algorithm in [7].
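The contrast between the two embedding strategies can be illustrated with a small simulation. It is a sketch under stated assumptions, not a reproduction of the experiments in Table 3: the vertices lie on a hypothetical integer lattice over a 1000 × 1000 map, sc = 1, R_x = R_y = 100, LD_x = LD_y = 10 (so L = 100), bit indexes are taken in row-major order, and the non-overlapping scheme is simplified to one watermark per map quadrant.

```python
def bit_index(x, y, R=100, LD=10):
    """Bit index of an integer vertex for a square R x R region (sc = 1)."""
    return ((y % R) // LD) * (R // LD) + (x % R) // LD


def quadrant(x, y, size=1000):
    """Simplified non-overlapping scheme: one watermark per map quadrant."""
    return (2 if y >= size // 2 else 0) + (1 if x >= size // 2 else 0)


# Hypothetical map: vertices on a lattice with spacing 5 over [0, 1000) x [0, 1000).
vertices = [(x, y) for x in range(0, 1000, 5) for y in range(0, 1000, 5)]

# Cropping attack: keep only the vertices inside a 150 x 150 window.
cropped = [(x, y) for x, y in vertices if 600 <= x < 750 and 620 <= y < 770]

indexes_left = {bit_index(x, y) for x, y in cropped}       # bit indexes still populated
quadrants_left = {quadrant(x, y) for x, y in cropped}      # quadrant watermarks still present
```

In this toy setting the cropped window still contains vertices for all 100 bit indexes, whereas only one of the four quadrant watermarks survives. A window at least as large as the R_x × R_y region always covers every logic domain, which is the intuition behind the results in Table 3.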

3.3 Multiple watermark capacity

The multiple watermark capacity of a multiple watermarking algorithm is the maximum number of watermarks that can be embedded. Generally, more watermarks can be embedded in maps with a greater data size (i.e., a greater number of vertex coordinates in the maps). Therefore, we chose 10 experimental maps that have different data sizes and that consist of points, polylines, or polygons.

In the experimental procedure, the watermarks were embedded in 10 vector geographic maps using the proposed algorithm and the algorithm in [7]. The program did not stop until no additional watermarks could be embedded in the cover maps, and the number of watermarks embedded in each map was recorded. Then, multiple watermark detection was carried out on the test maps. With respect to practical applications, we set the maximum number of embedded watermarks to 64 in our experiments. The experimental results are listed in Table 4, where √ indicates that all watermarks can be detected. In the experiments, L = 200.

Table 4 Multiple watermark capacity comparison for the two algorithms when L = 200

From the results listed in Table 4, we can observe that the number of embedded watermarks increases as the data size of the maps becomes larger, and that all watermarks can be detected from the watermarked maps. The proposed algorithm is superior to the algorithm in [7] in terms of multiple watermark capacity, primarily because the proposed algorithm uses the dichotomy method to subdivide the embedding region, while the algorithm in [7] adopts the quad-tree method. However, because of the dichotomy method, the number of watermarks is not directly proportional to the map data size. For example, Map 3 and Map 4, which have different numbers of vertices, both accommodate 4 watermarks. With the subdivision method shown in Fig. 4, the fifth block is 1/8 of the domain, so the number of coordinates in the fifth block is 1/8 of 1224, i.e., 153. Because 153 is smaller than the watermark length, 200, the fifth watermark cannot be embedded in the fifth block. For this reason, Map 4 accommodates only 4 watermarks instead of 6. Consequently, the multiple watermark capacity is related to both the data size and the length of the watermark. Nevertheless, the multiple watermark capacity of the proposed algorithm is sufficient for vector geographic data with common data sizes.
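The arithmetic behind this example can be reproduced with a short sketch. It rests on three assumptions that are not stated explicitly in the text: each new watermark takes half of the first largest existing block (a rule inferred from the remark that the fifth block is 1/8 of the domain), the vertices are spread evenly over the blocks, and a watermark can be embedded only if its block holds at least L vertices. The function names are hypothetical.

```python
from fractions import Fraction


def block_fractions(n):
    """Fractions of a logic domain held by n blocks after repeated halving.

    Assumption: each new watermark takes half of the first largest
    existing block, giving sizes 1, then 1/2 1/2, then 1/4 1/2 1/4, ...
    """
    blocks = [Fraction(1)]
    while len(blocks) < n:
        i = blocks.index(max(blocks))   # first largest block
        blocks[i] /= 2                  # the earlier watermark keeps one half
        blocks.append(blocks[i])        # the new watermark gets the other half
    return blocks


def estimated_capacity(num_vertices, wm_length, max_watermarks=64):
    """Largest n for which even the smallest block holds wm_length vertices."""
    capacity = 0
    for n in range(1, max_watermarks + 1):
        if min(block_fractions(n)) * num_vertices < wm_length:
            break
        capacity = n
    return capacity


fifth_block = block_fractions(5)[4] * 1224   # vertices in the fifth block
capacity = estimated_capacity(1224, 200)     # watermarks that fit for L = 200
```

With the 1224 vertices quoted in the text, the fifth block holds 153 vertices and the estimated capacity for L = 200 is 4. The capacity changes only when the vertex count crosses a power-of-two multiple of L, which is why it grows in steps rather than in direct proportion to the data size.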

To analyze the relationship between L and N, let L = 100, L = 300, and L = 400. Then, the above experimental procedures are repeated, and the experimental results are shown in Tables 5, 6, and 7.

Table 5 Multiple watermark capacity test for the proposed algorithm when L = 100
Table 6 Multiple watermark capacity test for the proposed algorithm when L = 300
Table 7 Multiple watermark capacity test for the proposed algorithm when L = 400

From the results listed in Tables 5, 6, and 7, the multiple watermark capacity is roughly proportional to the data size, which is in accordance with the results in Table 4. According to the results in Tables 4, 5, 6, and 7, the multiple watermark capacity is also inversely related to the length of the watermark: the capacity becomes larger as the length of each watermark becomes shorter.

4 Conclusions

We proposed a multiple watermarking algorithm for vector geographic data using vertex coordinate mapping and logic domain subdivision. The algorithm aims to improve the robustness against common attacks, with an emphasis on cropping attacks. To this end, we first mapped the vertex coordinates to the logic domains. Then, based on the dichotomy method and the proposed rules, each logic domain was subdivided into blocks according to the number of watermarks, and multiple watermarks were embedded into the corresponding blocks one by one. The experimental results showed that the proposed algorithm performs well in terms of robustness and multiple watermark capacity.

There are two drawbacks to the proposed algorithm. One is that the multiple watermark capacity decreases as the data size becomes smaller, and the other is that the algorithm lacks robustness against geometric transformations (such as rotation, scaling, and translation). In future work, we intend to increase the multiple watermark capacity for vector geographic maps with few vertices and to improve the robustness against geometric transformation attacks.