1 Introduction

Among proposed line detectors, LSD [2] is one of the best and most popular methods. It accurately detects segments and requires no threshold tuning, relying instead on the a contrario methodology. Although its results are very good for small images (up to 1 Mpixel), it tends to give poorer results on high-resolution images (5 Mpixels and more). As explained by Grompone von Gioi et al. [2], the detection changes after scaling or cropping the picture. For high-resolution images, detections are often over-segmented into small pieces of segments and some lines are not detected at all (see Fig. 1).

These poor results can be traced back to the greedy nature of LSD. Detected segments are in fact rectangular areas that contain a connected cluster of pixels with similarly oriented gradients. Once a cluster is identified, a score representing a number of false alarms (NFA) validates the detection as an actual segment or not. However, in high-resolution images, edges tend to be weaker, which breaks the connectivity between pixel clusters and yields over-segmentation or missed detections in low-contrast areas.

Fig. 1. Lines detected with LSD [2] (left) or with MLSD [3] (right). The picture has a resolution of 15 Mpixels.

We propose a method that generalizes LSD to any kind of image, without being affected by its resolution. For this, we use a multiscale framework and information from coarser scales to better detect segments at finer scales. In our companion paper [3], we compare it to other state-of-the-art line detectors, namely LSD [2] and EDLines [1], as a building block of a structure from motion (SfM) pipeline [4] to obtain quantitative, objective results.

2 Notation

We use the same notations as the companion paper [3], recalled here. In the following, the k index will denote the scale associated with the feature.

2.1 Upscaled Segment

Given a coarse segment \(s_i^{k-1}\) of direction \(\theta (s_i^{k-1})\) detected with some angular tolerance \(\pi \, p_i^{k-1}\) (\(0\le p_i^{k-1} \le 1\) represents a probability), we define \(\mathcal {A}_i^k\) as the rectangular area of \(s_i^{k-1}\) upscaled in \(I^k\), and \(\mathcal {P}_i^k\) as the subset of pixels in \(\mathcal {A}_i^k\) that have the same direction as \(s_i^{k-1}\) up to \(\pi \, p_i^{k-1}\):

$$\begin{aligned} \mathcal {P}_i^k = \left\{ q \in \mathcal {A}_i^k \text { s.t. } |\theta (q) - \theta (s_i^{k-1})|_{(\text {mod}\;\pi )} < \pi \, p_i^{k-1} \right\} , \end{aligned}$$
(1)

where \(\theta (q)\) is the direction orthogonal to the gradient at pixel q. Note that we only consider a gradient direction if the gradient magnitude is above a given threshold \(\rho =2/\text {sin}(45^\circ /2)\), as in the original LSD, because this value is a good trade-off between detection quality and speed.
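As an illustration of Eq. (1), the following Python sketch selects the aligned pixels of an upscaled area. It is only a sketch under stated assumptions: the function names, the dictionary-based gradient representation and the way the area is enumerated are hypothetical and are not taken from the MLSD code.

```python
import math

# Gradient magnitude threshold rho = 2 / sin(45 deg / 2), as in the text.
RHO = 2.0 / math.sin(math.radians(45.0 / 2.0))


def angle_diff_mod_pi(a, b):
    """Absolute difference between two directions, taken modulo pi."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)


def aligned_pixels(area, gradient, theta_segment, p):
    """Return the pixels of `area` whose direction matches the coarse segment.

    area          -- iterable of (x, y) pixels of the upscaled rectangle (assumed)
    gradient      -- dict mapping (x, y) to (gx, gy) (assumed representation)
    theta_segment -- direction of the coarse segment, in radians
    p             -- tolerance probability, so the angular tolerance is pi * p
    """
    selected = []
    for q in area:
        gx, gy = gradient[q]
        if math.hypot(gx, gy) <= RHO:
            continue  # gradient too weak: its direction is not considered
        theta_q = math.atan2(gx, -gy)  # direction orthogonal to the gradient
        if angle_diff_mod_pi(theta_q, theta_segment) < math.pi * p:
            selected.append(q)
    return selected
```

Because directions are compared modulo \(\pi\), the sign convention chosen for the direction orthogonal to the gradient does not affect the result.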

2.2 Fusion Score

Given n segments \(S = \{s_1, ..., s_n\}\), let \( Seg (\cup _{i=1}^n s_i)\) be the best segment computed from the union of the clusters \(s_i\), defined as the smallest rectangle that contains the rectangles associated with all segments \(s_i\). The corresponding fusion score of the set of segments is defined as:

$$\begin{aligned} \mathcal {F}(s_1, ..., s_n) = \log \left( \frac{\text {NFA}_\mathcal {M}(s_1, ..., s_n, p)}{\text {NFA}_\mathcal {M}( Seg (\cup _{i=1}^n s_i), p)}\right) . \end{aligned}$$
(2)

The NFA is computed with Eq. (3) of the companion paper [3]:

$$\begin{aligned} \text {NFA}_\mathcal {M}(S, p) = \gamma N_{L} \left( {\begin{array}{c}(NM)^{\frac{5}{2}}\\ n\end{array}}\right) \prod _{i=1}^n (|s_i| + 1) \mathcal {B}(|s_i|, k_{s_i}, p) \end{aligned}$$
(3)

in an \(N\times M\) image, where \(\gamma \) is the number of tested values for the probability p, \(N_L\) the number of possible segments in the image, and \(k_s\) the number of pixels in the rectangle aligned with its direction, with tolerance \(\pi p\). It uses the tail of the binomial law:

$$\begin{aligned} \mathcal {B}(|s|, k_s, p) = \sum _{j=k_s}^{|s|} \left( {\begin{array}{c}|s|\\ j\end{array}}\right) p^j(1-p)^{|s|-j}. \end{aligned}$$
(4)

The fusion score defines a criterion for segment merging that does not rely on any parameter. If it is positive, the segments \(s_1, ..., s_n\) should be merged into \( Seg (\cup _{i=1}^n s_i)\); otherwise, they should be kept separate.
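To make Eqs. (2) to (4) concrete, here is a minimal Python sketch that evaluates the fusion score in log space. It is an illustration under assumptions, not the authors' implementation: each segment is reduced to a hypothetical pair (|s|, k_s), all function names are ours, and the values of \(\gamma \) and \(N_L\) must be supplied by the caller (they cancel in the ratio of Eq. (2)).

```python
import math


def log_binomial_tail(size, k, p):
    """log B(size, k, p), the binomial tail of Eq. (4), for 0 < p < 1 and k <= size."""
    terms = [
        math.lgamma(size + 1) - math.lgamma(j + 1) - math.lgamma(size - j + 1)
        + j * math.log(p) + (size - j) * math.log1p(-p)
        for j in range(k, size + 1)
    ]
    m = max(terms)  # log-sum-exp to avoid underflow of very small tails
    return m + math.log(sum(math.exp(t - m) for t in terms))


def log_binomial_coefficient(x, n):
    """log of the generalized binomial coefficient C(x, n) for real x >= n."""
    return sum(math.log(x - i) for i in range(n)) - math.lgamma(n + 1)


def log_nfa(segments, p, width, height, gamma, n_l):
    """log NFA_M of Eq. (3); `segments` is a list of (|s|, k_s) pairs."""
    n = len(segments)
    value = math.log(gamma) + math.log(n_l)
    value += log_binomial_coefficient((width * height) ** 2.5, n)
    for size, k in segments:
        value += math.log(size + 1) + log_binomial_tail(size, k, p)
    return value


def fusion_score(segments, merged, p, width, height, gamma, n_l):
    """Fusion score of Eq. (2); `merged` is the (|s|, k_s) pair of Seg(union)."""
    separate = log_nfa(segments, p, width, height, gamma, n_l)
    together = log_nfa([merged], p, width, height, gamma, n_l)
    return separate - together
```

With this sketch, two well-aligned halves whose enclosing rectangle is also well aligned yield a positive score, whereas an enclosing rectangle that is mostly filled with non-aligned pixels yields a negative one.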

3 Dense-Gradient Filter

For SfM purposes, too high a density of segment detections in some area, such as a grid pattern, often leads to incorrect line matching. A high density of similar lines also leads to less accurate calibration because it gives too much weight to similar information and thus tends to reduce or ignore information from lines located in other parts of the image. To address this issue, we designed a filter that disables detection in regions where gradients are too dense. It also speeds up detection, as these regions often generate many tiny aligned segments that would need to be merged during our post-detection merging.

For this, we first detect regions with a local gradient density above a given threshold. The segment detection is then disabled in these areas. The process is fast because we apply it only at the coarsest scale using summed area tables.

With this filter, we may obtain a less exhaustive segment detection in these areas and at their borders. However, it leads to better matching and better calibration. It also decreases computation time for images containing this type of region.

4 Implementation

Our implementation is available on GitHub (https://github.com/ySalaun/MLSD).

4.1 Main Algorithm

Our algorithm is an iterative loop of three steps, run for each considered scale of the picture:

  1. Multiscale transition: Upscale information from the previous, coarser scale and use this information to compute segments at the current, finer scale.

  2. Detection: Detect segments using the standard LSD algorithm [2].

  3. Post-detection merging: Merge neighboring segments at the current scale.

Fig. 2. Multiscale Line Segment Detector (MLSD).

Note that step 2 uses the exact same procedure as LSD and thus will not be described in this paper. The dense-gradient filter can optionally be used at the coarsest scale.

The number of considered scales is denoted by K. Though it can be chosen by the user, we compute it automatically depending on the size of the picture:

$$\begin{aligned} K = \min \{k \in \mathbb {N} \text { s.t. } \max (w, h) \le 2^k s_{max}\}, \end{aligned}$$

where w (resp. h) is the width (resp. height) of the picture. We use a scale step of 2 and chose \(s_{max} = 1000\) because we did not observe over-segmentation for images smaller than \(1000\times 1000\) pixels, and because reducing the picture size too much can create artifacts.
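The formula for K can be evaluated with a short loop. The Python sketch below is only an illustration; the function name is hypothetical and does not come from the MLSD code.

```python
def number_of_scales(width, height, s_max=1000):
    """Smallest k such that max(width, height) <= 2**k * s_max."""
    k = 0
    while max(width, height) > (2 ** k) * s_max:
        k += 1
    return k
```

For instance, a picture of \(4752\times 3168\) pixels (about 15 Mpixels) gives K = 3, since \(4752 > 4 \times 1000\) but \(4752 \le 8 \times 1000\), while any picture that fits in \(1000\times 1000\) pixels gives K = 0.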

The overall algorithm is described in Fig. 2. The successive steps are described below, with pseudo-code giving the main steps and additional details in the text.

4.2 Dense-Gradient Filter

The general idea of the dense-gradient filter is to discard from detection areas in which there is a high density of pixels with strong gradients.

Experimentally, we observed that it is difficult to set a density threshold for a proper filtering. If the density threshold is too low, it tends to discard pixels that may belong to interesting segments. If it is too high, it does not filter out enough pixels. For this reason, we perform the filtering in two steps. First, we identify which pixels are at the center of dense-gradient areas. Second, we disregard a region which is larger than the one that is used to evaluate the density of strong gradients.

Fig. 3. Dense gradient filter.

This procedure is implemented in our code by the function denseGradientFilter and described in Fig. 3. It consists of three steps:

  1. Summed area table: We use a value of 1 for pixels with a gradient above \(\rho \) and 0 otherwise.

  2. Filtering: For each pixel, we estimate the local density \(\tau \) of pixels within a \(5\times 5\) window. The filtered pixels are those with a density \(\tau > \tau _{ DENSE } = 0.75\).

  3. Expansion: For each filtered pixel p, we also discard every pixel inside a \(21\times 21\) window centered at p.
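The three steps above can be sketched in Python as follows. This is a minimal illustration under assumptions, not the code of denseGradientFilter: the function names and the list-of-lists image representation are hypothetical, and windows are clipped at the image borders, a choice the text does not specify. The expansion uses a second summed area table, relying on the fact that a pixel lies in the \(21\times 21\) window of a filtered pixel exactly when that filtered pixel lies in the \(21\times 21\) window centered on it.

```python
def summed_area_table(image):
    """Summed area table with an extra leading row and column of zeros."""
    h, w = len(image), len(image[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            sat[y + 1][x + 1] = image[y][x] + sat[y][x + 1] + sat[y + 1][x] - sat[y][x]
    return sat


def window_sum(sat, y, x, radius, h, w):
    """Sum over the window centered at (y, x), clipped to the image, and its area."""
    y0, y1 = max(0, y - radius), min(h, y + radius + 1)
    x0, x1 = max(0, x - radius), min(w, x + radius + 1)
    total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
    return total, (y1 - y0) * (x1 - x0)


def dense_gradient_mask(strong, window=5, expansion=21, tau_dense=0.75):
    """Return a 2D list of booleans, True where detection is disabled.

    strong -- 2D list of 0/1 flags, 1 where the gradient magnitude exceeds rho
    """
    h, w = len(strong), len(strong[0])
    # Step 1: summed area table of the strong-gradient flags.
    sat = summed_area_table(strong)
    # Step 2: mark pixels whose local density exceeds tau_dense.
    centers = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, area = window_sum(sat, y, x, window // 2, h, w)
            if total > tau_dense * area:
                centers[y][x] = 1
    # Step 3: discard every pixel that has a filtered pixel in its expansion window.
    center_sat = summed_area_table(centers)
    return [
        [window_sum(center_sat, y, x, expansion // 2, h, w)[0] > 0 for x in range(w)]
        for y in range(h)
    ]
```

Both passes cost a constant number of table lookups per pixel, independently of the window sizes.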

4.3 Multiscale Transition

We use a multiscale exploration that propagates detection information from coarse scales to finer scales, which helps reduce over-segmentation.

Fig. 4. Multiscale transition.

This procedure is implemented in our code by the function refineRawSegments and described in Fig. 4; some parts are illustrated in Fig. 5. It iterates over each segment detected at the previous scale:

Fig. 5. Illustration of the multiscale processing steps. The large orange rectangular box represents the upscaled region of the coarser segment. The blue rectangular box at step 4 represents the region of the detected segment at the finer scale. \(\mathcal {P}_i^k\) is represented at step 1 with black pixels. After aggregation, we only keep regions with at least 10 pixels. At step 3, we treat each cluster as an LSD segment with a barycenter, width and direction. We then consider the line that goes through its barycenter with its direction (represented in yellow) to find merge candidates. (Color figure online)

  1. Upscaling: The segment coordinates are upscaled and we compute the set \(\mathcal {P}_i^k\) of pixels aligned with the segment direction (1).

  2. Aggregation: Pixels inside \(\mathcal {P}_i^k\) initialize clusters and are aggregated following an 8-neighborhood greedy method.

  3. Merging: The clusters found at step 2 are merged according to the fusion score (2). As we cannot practically consider all the possible groups of clusters, we sort them by increasing NFA (i.e., decreasing meaningfulness) and queue them. Then, for each cluster c, we find the set of clusters that intersect the line corresponding to c and try to merge the whole set. If merging is validated by the fusion score, we add the new segment to the queue and dequeue the merged segments.

  4. Segment computation: For each resulting cluster, if the NFA is low enough, we compute its corresponding segment and add it to the current set \(\mathcal {S}^k\).

In the case where no pixel is selected at step 1 (\(\mathcal {P}_i^k = \emptyset \)) or no segment is added at step 4, we add the original segment to \(\mathcal {S}^k\) together with its scale information. It is kept as is, unrefined, until all scales are explored. This allows detecting segments in low-contrast areas. Following LSD [2], we use a threshold \(\epsilon = 1\) for the NFA, which corresponds to one false detection per image. LSD authors have shown that with values of TODO, the results do not change too much. In the code, this value is represented by the variable logeps.
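The aggregation of step 2 can be sketched as a flood fill over the pixels of \(\mathcal {P}_i^k\). The Python code below is only an illustration under assumptions: the function name and the representation of pixels as (x, y) tuples are hypothetical, and the minimum cluster size of 10 pixels is the one mentioned in the caption of Fig. 5.

```python
def aggregate_clusters(pixels, min_size=10):
    """Group pixels into 8-connected clusters and drop clusters below min_size."""
    remaining = set(pixels)
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster, stack = [seed], [seed]
        while stack:
            x, y = stack.pop()
            # Visit the 8 neighbours; the centre itself was already removed.
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbour = (x + dx, y + dy)
                    if neighbour in remaining:
                        remaining.remove(neighbour)
                        cluster.append(neighbour)
                        stack.append(neighbour)
        if len(cluster) >= min_size:
            clusters.append(sorted(cluster))
    return clusters
```

Each pixel is visited once, so the cost is linear in the size of \(\mathcal {P}_i^k\). Diagonal chains of pixels stay in a single cluster because diagonal neighbours count as connected.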

4.4 Post-detection Merging

As new segments may be detected by LSD at the current scale, in addition to the segments originating from coarser scales, and as these segments may correspond to some form of over-segmentation, we apply another pass of segment merging, similar to the one used in multiscale transition but simplified to consider a reduced number of possible fusions.

Fig. 6. Post-detection merging.

Fig. 7. Lines detected with LSD [2] (middle) and with MLSD [3] (right) in a 15 Mpixels image (left).

Fig. 8. Zoom of the bottom-right corner of pictures in Fig. 7.

Fig. 9. Comparison between MLSD with and without dense-gradient filter.

This procedure is implemented in our code by the function mergeSegments and described in Fig. 6. It iterates over each previously detected segment. For this, as above, we first sort the segments by increasing NFA and push them to a queue. We then iterate the following steps until the queue is empty:

  1. Clustering: For each segment \(s_i\), we consider its central line as in multiscale transition (Sect. 4.3). For each of the two directions of the line, we examine the first cluster intersecting the line and such that its direction is similar to the direction of \(s_i\) up to tolerance \(\pi \, p_i\).

  2. Merging: The previously selected segments are merged if the fusion score (2) is positive.

  3. Queuing: If a merged segment was created, add the new segment to the queue and dequeue the merged segments.
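The queue-based loop above can be sketched generically in Python. This is a structural illustration only, not the code of mergeSegments: the geometric search along the central line and the fusion test are passed in as hypothetical callbacks (`neighbours` and `try_merge`), and appending the merged segment at the end of the queue is an assumption, since the text does not say where it is inserted.

```python
def merge_segments(segments, log_nfa, neighbours, try_merge):
    """Greedy merging loop over hashable segment objects.

    log_nfa(s)            -- log NFA of segment s (lower is more meaningful)
    neighbours(s, active) -- merge candidates of s among the active segments
    try_merge(group)      -- merged segment if the fusion score is positive, else None
    """
    active = set(segments)
    queue = sorted(segments, key=log_nfa)  # increasing NFA
    while queue:
        s = queue.pop(0)
        if s not in active:
            continue  # s was already absorbed into a merged segment
        candidates = [c for c in neighbours(s, active) if c in active and c != s]
        if not candidates:
            continue
        merged = try_merge([s] + candidates)
        if merged is None:
            continue  # fusion score not positive: keep the segments separate
        active.difference_update([s] + candidates)
        active.add(merged)
        queue.append(merged)  # the new segment may merge again later
    return active
```

Segments that were merged are dropped lazily: they stay in the queue but are skipped when popped, which avoids searching the queue at every merge.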

5 Examples

In this section, we compare MLSD to LSD using images that specifically illustrate the limitations of LSD. Computation times are given for both methods.

5.1 Low-Contrast Images

We first consider an image with low contrast (Fig. 7). As can be seen when zooming into the picture (see Fig. 8), the image is also noisy. As the gradient is also noisy around edges, LSD hardly detects the tile borders at all, whereas MLSD does detect them and does not create much over-segmentation.

Fig. 10. Comparison between LSD and MLSD on the same image with original resolution (above, 5 Mpixels) and four times reduced (below, 330 kpixels).

Fig. 11. Comparison between LSD and MLSD on the same image with original resolution (above, 360 kpixels) and resolution four times larger (below, 5.8 Mpixels).

Fig. 12. Comparison between LSD and MLSD on the same image (top, 18 Mpixels) but with cropped versions (no resolution changes) of respectively 5.6 Mpixels (middle) and 1.5 Mpixels (bottom).

5.2 Dense-Gradient Filter

Figure 9 illustrates the efficiency of the dense-gradient filter. The main difference between the two MLSD results occurs in the central part of the image, where there is a grid pattern, whereas the other parts are not affected. This type of pattern tends to slow the algorithm down considerably (typically by a factor of 3). Moreover, as argued above, the added segments are similar to each other and tend to deteriorate matching, and thus SfM results.

5.3 Effect on Scaling and Crop

Figures 10 and 11 illustrate the differences between LSD and MLSD for images with different resolutions. In Fig. 10, we decreased the size of the image (four times in both height and width) and in Fig. 11, we increased the size of the image (four times in both height and width).

Figure 12 illustrates the differences between LSD and MLSD for cropped versions of the same image. We cropped the original picture to a picture 2 times smaller and then to one 4 times smaller. Whereas MLSD does not show significant changes from one picture to the other, LSD tends to detect segments differently.

Although the results change in each case for both algorithms, the changes are limited for MLSD, whereas LSD gives noticeably different results after either a change of resolution or a crop.

6 Conclusion

We presented MLSD, a multiscale extension to the popular Line Segment Detector (LSD). MLSD is less prone to over-segmentation and is more robust to noise and low contrast. Being based on the a contrario theory, it retains the parameterless advantage of LSD, at a moderate additional computation cost. The source code accompanying this paper is not yet as clean and readable as it could be. We plan to clean it and build an online demonstration with it on the IPOL website (http://www.ipol.im/).