# Size- and Port-Aware Horizontal Node Coordinate Assignment

## Abstract

The approach by Sugiyama et al. is widely used to automatically draw directed graphs. One of its steps is to assign horizontal coordinates to nodes. Brandes and Koepf presented a method that proved to work well in practice. We extend this method to make it possible to draw diagrams with nodes that have considerably different sizes and with edges that have fixed attachment points on a node’s perimeter (ports). Our extensions integrate seamlessly with the original method and preserve the linear execution time.

## Keywords

Node size, Original algorithm, Outgoing edge, Straight edge, Diagram type

## 1 Introduction

The layer-based approach to graph layout as introduced by Sugiyama et al. [6] is a well-established methodology to automatically draw directed graphs in the plane. It is defined as a pipeline of three subsequent phases: *node layering* distributes the nodes into subsequent layers such that edges only point from lower to higher layers; *crossing minimization* orders the nodes in each layer such that the number of edge crossings is minimized; finally x-coordinate assignment (or *node placement*) determines x coordinates for nodes. In practice, an initial *cycle breaking* phase as well as a final *edge routing* phase are often added to support cyclic graphs and non-simple edge routing styles.

In the area of model-driven engineering (MDE), graphical languages are often used to model complex software systems. For instance, tools such as *LabVIEW* (National Instruments), *EHANDBOOK* (ETAS), and *Ptolemy* (UC Berkeley) allow users to model systems using *data flow diagrams* and make use of automatic layout algorithms to arrange nodes and edges. In such diagrams edges are usually routed in an orthogonal fashion and connect to nodes through dedicated attachment points on a node’s boundary (so-called *ports*). Nodes also differ considerably in size; see Fig. 1 for examples.

All of these characteristics pose challenges for automatic graph drawing algorithms that are rarely addressed by existing solutions. Previous work by Schulze et al. [5] introduced methods that extend the layer-based approach to support the special requirements of data flow diagrams, focusing on crossing minimization and edge routing. In this paper, we focus on node placement.

While we refer to Healy and Nikolov [3] for a general overview of existing node placement approaches, it is worth noting that most of them try, to a certain extent, to reduce the number of edge bend points. The approach introduced by Sander [4] ensures that long edges are always drawn straight, but uses a barycenter-like balanced placement for all other edges. Once a node has more than one outgoing edge, this usually results in two bend points per edge. The approach introduced by Brandes and Koepf [1], extending ideas of Buchheim et al. [2], instead tries to draw as many edges straight as possible.

*Contributions.* Brandes and Koepf assume that all nodes have the same size and do not take ports into account; thus their algorithm straightens at most one outgoing edge per node. In this paper, we extend the approach by Brandes and Koepf to remove these restrictions and take the opportunity to place nodes such that more than one outgoing edge per node can be drawn straight. This leads to drawings as seen in Fig. 1. Throughout the paper we will assume that the node placement algorithm cannot change the size of nodes and the position of ports.

*Outline.* Following the usual conventions, we start by introducing the required terminology in the next section. Sect. 3 then gives an overview of the algorithm by Brandes and Koepf before Sect. 4 introduces our extensions. We evaluate our algorithm in Sect. 5 and close with a conclusion and future work in Sect. 6.

## 2 Preliminaries

Let \(G=(V, P, \pi , E)\) denote a directed graph with ports, where *V* is a set of nodes and *P* a set of ports, i.e. attachment points on a node’s boundary. \(\pi : P \mapsto V\) assigns each port to a node. \(E \subseteq P \times P\) is a set of directed edges connecting the ports.

During the first steps of the layer-based approach cyclic graphs are made acyclic, a *layering* is calculated, and an ordering is determined for each layer. A layering \(\mathcal {L}\) is an ordered partition of *V* into non-empty *layers* \(L_1, \dots , L_{|\mathcal {L}|}\) and \(\mathcal {L}(v) \rightarrow \{1,\dots ,|\mathcal {L}|\}\) maps each node \(v\in V\) to the index of its respective layer. Since all edges must point in the same direction, \(\mathcal {L}(\pi (p)) < \mathcal {L}(\pi (q))\) must hold for all edges \((p,q) \in E\). An edge (*p*, *q*) is *short* if \(\mathcal {L}(\pi (q))-\mathcal {L}(\pi (p)) = 1\); it is *long* otherwise. A layering is *proper* if all edges are short. Note that a layering can be made proper by splitting long edges and introducing *dummy nodes*. We refer to the short edges of a proper layering as *edge segments*. That is, an original edge can be represented by one or more edge segments. Each layer \(L_i \in \mathcal {L}\) is an ordered tuple of nodes \((v^i_1,\dots ,v^i_n)\), where \(n=|L_i|\). The position of a node \(v^i_j\) in layer *i* is \(pos(v^i_j) = j\) and the predecessor of a node \(v^i_j\) with \(j > 1\) is \(pred(v^i_j) = v_{j-1}^i\). This gives a properly layered, directed, acyclic graph with ports (LDAGP) \(G'=(V', P', \pi ', E', \mathcal {L})\). The set of nodes now includes a set of dummy nodes \(\mathcal {D}\) such that \(V'=V\cup \mathcal {D}\). For each dummy node two ports are introduced and edges are added and reconnected accordingly.
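The step of making a layering proper can be illustrated with a minimal sketch. It works on node-level edges only and ignores ports, which is a simplification of the definitions above; the function name `make_proper` and the `dummy<k>` naming scheme are assumptions made for this example.

```python
def make_proper(layer_of, edges):
    """Split long edges into edge segments by inserting dummy nodes.

    layer_of: dict mapping each node to its layer index (assumed input format).
    edges:    list of (u, v) node pairs with layer_of[u] < layer_of[v].
    Returns the extended layer map and the list of short edge segments.
    """
    layer_of = dict(layer_of)          # do not mutate the caller's mapping
    segments = []
    dummy_count = 0
    for u, v in edges:
        if layer_of[v] - layer_of[u] < 1:
            raise ValueError(f"edge ({u}, {v}) does not point to a higher layer")
        prev = u
        # One dummy node for every layer strictly between the two end points.
        for layer in range(layer_of[u] + 1, layer_of[v]):
            dummy = f"dummy{dummy_count}"
            dummy_count += 1
            layer_of[dummy] = layer
            segments.append((prev, dummy))
            prev = dummy
        segments.append((prev, v))
    return layer_of, segments


if __name__ == "__main__":
    layers, segs = make_proper({"a": 1, "b": 2, "c": 4}, [("a", "b"), ("a", "c")])
    print(segs)
```

In the setting with ports, each dummy node would additionally receive two ports, as described above.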

Finally, let \(width : V' \mapsto \mathbb {R}\) assign a width to each node. Throughout this paper, we assume that for an edge \((p,q)\in E\), *p* is on the lower boundary of \(\pi (p)\) and *q* is on the upper boundary of \(\pi (q)\) to prevent edges from crossing nodes. Let \(x_p : P' \mapsto \mathbb {R}\) assign positions to ports relative to the leftmost point on their respective boundary.

## 3 The Original Algorithm

In this section we give a brief summary of the original algorithm of Brandes and Koepf. For further details we refer to the paper itself [1]. The basic idea of the algorithm is to traverse a given graph in different directions to calculate four extremal layouts and combine them into a balanced final layout. The algorithm is divided into the following steps: (1) During *Vertical Alignment* nodes are combined into so-called *blocks*. Different directions may result in different blocks. Edges between the nodes in a block will be drawn straight. (2) *Horizontal Compaction* moves the calculated blocks as close to each other as possible and assigns explicit x coordinates to nodes. Depending on the direction nodes are either compacted leftwards or rightwards. (3) *Balancing* combines the four extremal layouts resulting from the previous two steps to a final drawing.

During the alignment step nodes are aligned with their median neighbor in the preceding layer. Consecutively aligned nodes are referred to as a block, see Fig. 2b for an illustration. Let \(\mathcal {B}\) denote the set of blocks of an LDAGP \(G'\), where each block \(b\in \mathcal {B}\) is represented as an ordered tuple of edges \((e_0,\dots ,e_n)\). For compaction, an auxiliary *block graph* is constructed as seen in Fig. 2c. Blocks are the nodes in the block graph and are connected by an edge if two nodes of different blocks are consecutive in their layer. Within the block graph, blocks are divided into *classes*. A class is defined by a unique sink that is reachable by all of the class’s nodes. Positions subject to a global separation value \(\delta _g\) are then assigned to blocks using a longest path layering within each class, which recursively assigns positions relative to the class’s sink. If two adjacent blocks are part of the same class, their relative positions can be determined immediately. If they belong to different classes, the blocks impose a minimum required separation between the involved classes. This separation is remembered and applied after all blocks have been placed.

As mentioned earlier, the original approach does not cater for varying node sizes and ports. Ports give rise to two problems, both illustrated in Fig. 3a. First, in the depicted graph no edge is drawn straight even though all nodes of the blocks B1 and B2 are neatly left-aligned. Second, node n1 has two ports, both of which would allow the connected edge to be drawn straight. Yet, n1 and n4 are part of different blocks that will be separated during the compaction step. In addition, different node sizes aggravate both problems and render the global separation value \(\delta _g\) impractical: \(\delta _g\) would have to be larger than the widest node of the graph to avoid overlapping nodes, possibly leaving a lot of whitespace. Figure 3b and c show two drawings that would be more desirable using a local separation \(\delta _l\). Furthermore, in conjunction with orthogonally drawn edges, as opposed to general polylines, the balancing step often yields undesirable bend points (see Fig. 3d). For this reason we consider the balancing step to be optional and, if it is discarded, choose the final layout out of the four possible candidates based on the smallest width.
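Choosing the final layout by smallest width when balancing is skipped can be sketched as follows. The dictionary-based representation of a layout and the candidate names are assumptions made for this example, not part of the original algorithm's interface.

```python
def layout_width(x, width):
    """Width of one layout: distance from the leftmost to the rightmost node border.

    x:     dict mapping each node to the x coordinate of its left border.
    width: dict mapping each node to its width.
    """
    left = min(x.values())
    right = max(x[v] + width[v] for v in x)
    return right - left


def pick_narrowest(candidates, width):
    """Return the name of the candidate layout with the smallest width."""
    return min(candidates, key=lambda name: layout_width(candidates[name], width))


if __name__ == "__main__":
    width = {"a": 40, "b": 20}
    candidates = {
        "left_up": {"a": 0, "b": 50},    # hypothetical extremal layout
        "right_up": {"a": 0, "b": 45},   # hypothetical extremal layout
    }
    print(pick_narrowest(candidates, width))
```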

- *root*[*v*] denotes the root node of *v*’s block;
- *align*[*v*] maps to the next node within *v*’s block in the current iteration direction and represents a cyclically linked list;
- *sink*[*v*] stores the sink of the class *v* belongs to;
- *shift*[*v*] holds the distance by which the class of *v* should be moved during compaction.
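To make the roles of *root* and *align* concrete, the following is a simplified sketch of the vertical alignment step for a single direction (top to bottom, left to right). It assumes a proper layering with node-level edges and deliberately omits the conflict marking of the original algorithm; the function name and input format are assumptions made for this example.

```python
def vertical_alignment(layers, edges):
    """Align each node with a median upper neighbor (one direction only).

    layers: list of layers, each an ordered list of nodes.
    edges:  list of (u, v) pairs where u lies in the layer directly above v.
    Returns the dicts root and align as described above.
    """
    pos = {v: i for layer in layers for i, v in enumerate(layer)}
    root = {v: v for layer in layers for v in layer}
    align = dict(root)                      # every node starts as its own block
    upper = {v: [] for v in root}
    for u, v in edges:
        upper[v].append(u)

    for layer in layers[1:]:
        r = -1                              # rightmost position already aligned to
        for v in layer:
            ups = sorted(upper[v], key=pos.__getitem__)
            if not ups:
                continue
            d = len(ups)
            # One median for odd d, left and right median for even d.
            for m in sorted({(d - 1) // 2, d // 2}):
                um = ups[m]
                if align[v] == v and r < pos[um]:
                    align[um] = v
                    root[v] = root[um]
                    align[v] = root[v]      # close the cyclic list
                    r = pos[um]
    return root, align


if __name__ == "__main__":
    root, align = vertical_alignment(
        [["a", "b"], ["c", "d"]], [("a", "c"), ("a", "d"), ("b", "d")]
    )
    print(root, align)
```

Following *align* from any node eventually returns to that node, which is what makes each block a cyclically linked list.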

## 4 Size- and Port-Aware Node Coordinate Assignment

*inner shift*. It calculates offsets for nodes within a block to account for ports and simultaneously determines the width of the blocks, which is required to calculate the size of a layout. Second, we extend the compaction phase to consider node sizes when calculating explicit x-coordinates. Third, we modify the objective such that more straight edges are, to a certain extent, favored over achieving the most compact layout possible. All additions integrate seamlessly with the original algorithm and preserve its linear execution time.

*Node Size and Port Support.* The original algorithm assigns the same x-coordinate to all nodes within a block. This automatically yields straight edges if all nodes have the same size and the same attachment points for edges. Here we extend this in two ways. First, blocks have a width that depends on the sizes of the block’s nodes. Second, each node has an *inner shift*, which is an offset relative to a block’s left border. The inner shift is used to properly deal with ports.

Given a set of blocks \(\mathcal {B}\) calculated by the vertical alignment method of the original algorithm, we execute Algorithm 1. For each block, it iterates through the block’s edges (*p*, *q*), considers *p* to be fixed and determines an offset value for \(\pi (q)\) such that (*p*, *q*) can be drawn straight. Additionally, the maximum extent of the nodes to either side of the starting node’s leftmost coordinate is recorded. Using these values, the size of each block is calculated and all inner shift values are shifted to be relative to the leftmost coordinate of any of the block’s nodes. The block size is used to determine the width of each extremal layout. Figure 4 illustrates the effect of the inner shift.
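The description above can be sketched in a few lines. This is not a transcription of Algorithm 1; it is a minimal reading of the prose, and the function name, the parameter names, and the dictionary-based inputs are assumptions made for this example.

```python
def inner_shift(block_nodes, block_edges, node_of, port_x, width):
    """Compute inner shifts and the width of a single block.

    block_nodes: nodes of the block, starting node first.
    block_edges: the block's edges as (p, q) port pairs, in block order.
    node_of:     dict port -> node (the mapping pi).
    port_x:      dict port -> offset from the left border of its node (x_p).
    width:       dict node -> width.
    """
    first = block_nodes[0]
    shift = {first: 0.0}
    left, right = 0.0, width[first]        # extent relative to the first node
    for p, q in block_edges:
        u, v = node_of[p], node_of[q]
        # Treat p as fixed and move v so that q ends up directly below p.
        shift[v] = shift[u] + port_x[p] - port_x[q]
        left = min(left, shift[v])
        right = max(right, shift[v] + width[v])
    # Make all shifts relative to the leftmost border within the block.
    shift = {v: s - left for v, s in shift.items()}
    return shift, right - left


if __name__ == "__main__":
    shifts, block_width = inner_shift(
        ["n1", "n2"],
        [("p", "q")],
        {"p": "n1", "q": "n2"},
        {"p": 5, "q": 15},
        {"n1": 40, "n2": 20},
    )
    print(shifts, block_width)
```

In the example, both ports end up at the same absolute offset within the block (10 + 5 for `p`, 0 + 15 for `q`), so the edge is straight.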

Given an inner shift for the nodes of each block, the horizontal compaction technique is applied with the alterations seen in lines 10 and 13 of Algorithm 2. Contrary to the original method, the inner shift and the width of the nodes are considered while iterating through the block. Note that we consider the individual width of every node and do not use the overall width of a block. This allows blocks to “flow” into each other, as seen in Fig. 3b.
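The effect of considering individual node widths and inner shifts can be sketched with a strongly simplified compaction. The sketch places every block as far left as its left neighbors permit and ignores classes entirely (no *sink* or *shift* handling), so it is not Algorithm 2; the function name and input format are assumptions made for this example.

```python
def compact_left(layers, root, align, inner_shift, width, delta):
    """Assign x coordinates to nodes, compacting blocks to the left.

    layers:      list of layers, each an ordered list of nodes.
    root, align: block structure as produced by vertical alignment.
    inner_shift: dict node -> offset relative to its block's left border.
    width:       dict node -> width.
    delta:       local separation between neighboring nodes.
    """
    pred = {}
    for layer in layers:
        for left_node, node in zip(layer, layer[1:]):
            pred[node] = left_node

    x = {}                                  # x coordinate of each block root

    def place_block(v):
        if v in x:
            return
        x[v] = 0.0
        w = v
        while True:
            p = pred.get(w)
            if p is not None:
                u = root[p]
                place_block(u)              # left neighbor's block comes first
                # Right border of p plus separation, corrected by w's own shift.
                x[v] = max(x[v], x[u] + inner_shift[p] + width[p]
                           + delta - inner_shift[w])
            w = align[w]
            if w == v:
                break

    for v in root:
        if root[v] == v:
            place_block(v)
    return {v: x[root[v]] + inner_shift[v] for v in root}


if __name__ == "__main__":
    coords = compact_left(
        [["a", "b"], ["c", "d"]],
        {"a": "a", "b": "b", "c": "a", "d": "b"},
        {"a": "c", "c": "a", "b": "d", "d": "b"},
        {"a": 0, "b": 0, "c": 15, "d": 0},
        {"a": 20, "b": 20, "c": 40, "d": 20},
        10,
    )
    print(coords)
```

Because the constraint is evaluated per node rather than per block width, a narrow node of one block may sit next to a narrow node of the neighboring block even when other nodes of those blocks are wide. The recursion mirrors the recursive placement of the original method; for very deep block graphs an iterative variant would be needed in Python.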

Moreover, the inner shift of a node and its size have to be considered during the final balancing step, which is easy to incorporate into the original algorithm.

*Improving Straightness.* A wider node can allow for more than one edge to be drawn straight. The original algorithm did not have to address this since nodes were considered to be uniform. We solve this as follows. Remember that our extended compaction step as shown in Algorithm 2 compacts blocks and classes as much as possible. This implies that for a given iteration direction only such edges are possible candidates for additional straightening where one of the involved blocks was moved “too far” (for instance node n4 in Fig. 3b). In other words, we have to prevent the blocks of such edges from being compacted too far in order to get more straight edges.

The procedure we apply can be seen in Fig. 5. In (a) everything is compacted as much as possible. In (b) a threshold value thresh is used to prevent node n5 from moving further to the left, resulting in a straight edge.

*Execution Time.* For an LDAGP \(G'\), the original algorithm runs in time linear in the number of nodes and edge segments, \(O(|V'| + |E'|)\). Algorithm 1 is linear in the number of edge segments that are involved in blocks. Algorithm 2 only adds constant-time operations to the procedure of the original algorithm. Algorithm 3 additionally calculates the threshold value, which influences which edge will be picked later. To pick an edge, for every node the incident edge segments are touched at most once. Adding elements to and removing them from a queue can be done in constant time, and the post-processing step is bounded by the number of nodes and edge segments. Therefore, the overall execution time remains linear in the number of nodes and edge segments.

## 5 Evaluation

All drawings seen in Fig. 1 were created using the methods of Schulze et al. [5] in combination with our extensions. The methods are implemented in the KLay Layered algorithm and the drawings are created using the KLighD framework, both of which are part of the KIELER open source project.^{1}

^{2}, (3) data flow diagrams from the commercial interactive model browsing solution EHANDBOOK^{3}, and (4) SCGs, which are specialized control flow graphs for sequentially constructive programs [7]. The Ptolemy and EHANDBOOK diagrams are meant to be navigated using an expand/collapse mechanism. Figure 6 shows a diagram with both an expanded hierarchical node and a collapsed one. Scenarios where more than one edge can be drawn straight are more likely in the presence of expanded nodes as they are wider. We therefore fully expanded existing diagrams for our evaluation and then extracted each hierarchical level into a separate diagram. KLay supports hierarchical graphs by introducing additional ports where an edge crosses a hierarchy boundary, see for instance Fig. 6. The layout is then performed in a bottom-up fashion and additional ports are considered to be dummy nodes. After the evaluations we realized that the aforementioned extraction of subdiagrams kept several edges from being drawn straight since additional ports were fixed at disadvantageous positions. We believe the results could be better than reported.

Table 1. Summary of the evaluation data. For each diagram type the number of diagrams *d* is listed alongside the average number of nodes \(\bar{n}\) and edges \(\bar{m}\) per diagram. *IE* is the percentage increase for BKS compared to BK in the overall sum of all diagrams’ straight edges. *IS* indicates the increase of the average diagram size. By size we mean the width of top-down drawings and the height of left-right drawings. *ID* represents the number of diagrams for which BKS found more straight edges than BK. *OBK* and *OBKS* represent the number of diagrams for which BK and BKS found the optimum number of straight edges.

Type | *d* | \(\bar{n}\) | \(\bar{m}\) | *IE* | *IS* | *ID* | *OBK* | *OBKS* |
---|---|---|---|---|---|---|---|---|
Random | 106 | 29.5 | 46.5 | 0.1 | 0.0 | 4.7 | 58.5 | 60.4 |
SCGs | 107 | 134.4 | 268.7 | 3.5 | 0.0 | 96.3 | 2.8 | 47.7 |
EHANDBOOK | 97 | 21.6 | 24.1 | 3.7 | 2.1 | 18.6 | 58.8 | 66.0 |
Ptolemy | 1140 | 10.6 | 13.7 | 2.3 | 0.2 | 15.4 | 74.6 | 87.0 |

Table 1 summarizes the characteristics of each type of diagram and Fig. 7 shows a scatter plot for each one of them. It can be seen that for diagrams with same-sized nodes BK finds optimal or near-optimal solutions. The other three plots indicate that while BK’s overall performance is still very good, there are diagrams for which the number of straight edges can be improved. This is due to variable node sizes. BKS performs better here. The overall number of straight edges increases as well as the number of diagrams for which an optimum solution is found. For SCGs BKS produces more straight edges for almost every diagram. The average width of the tested diagrams on the other hand does not increase notably, which implies that for the tested graphs the additionally straightened edges did not negatively affect the width.

*Execution Time.* We measured the execution time of BK and BKS using randomly generated graphs with 40 different node counts between 10 and 1000, 1.5 edges per node, and node widths varying between 20 and 100. For each graph size, we generated 10 random graphs and ran the algorithm 10 times, using the average execution time as the result. The tests were executed using a 64-bit JVM on a laptop with an Intel i7 2 GHz CPU and 8 GB memory.

For graphs with up to 100 nodes both strategies finish in under 2.5 ms; for graphs with 1000 nodes they require about 62 ms. The average difference between BK and BKS is below 1 ms. Both strategies are therefore fast enough to be used in interactive modeling and browsing tools.

## 6 Final Remarks

We presented extensions to the node placement algorithm of Brandes and Koepf [1] to support different node sizes and ports. These extensions make the algorithm usable for a wider range of diagram types, including data flow diagrams. We evaluated our extensions on randomly generated diagrams as well as on three sets of real-world diagrams and found that the results were often near the optimum in terms of straight edges and that compactness did not suffer. Performance-wise, the algorithm fares well enough to be used in interactive applications.

For certain graphs, straightening edges may still lead to less compact diagrams. Our intuition is that drawing very few edges in a given diagram non-straight would often lead to a more compact layout. Future work could go into confirming or refuting this intuition and developing methods to find such edges.

## Notes

### Acknowledgements

This work was supported by the German Research Foundation under the project *Compact Graph Drawing with Port Constraints* (ComDraPor, DFG HA 4407/8-1).

## References

- 1. Brandes, U., Köpf, B.: Fast and simple horizontal coordinate assignment. In: Mutzel, P., Jünger, M., Leipert, S. (eds.) GD 2001. LNCS, vol. 2265, p. 31. Springer, Heidelberg (2002)
- 2. Buchheim, C., Jünger, M., Leipert, S.: A fast layout algorithm for \(k\)-level graphs. In: Marks, J. (ed.) GD 2000. LNCS, vol. 1984, pp. 229–240. Springer, Heidelberg (2001)
- 3. Healy, P., Nikolov, N.S.: Hierarchical drawing algorithms. In: Tamassia, R. (ed.) Handbook of Graph Drawing and Visualization, pp. 409–453. CRC Press, Boca Raton (2013)
- 4. Sander, G.: A fast heuristic for hierarchical Manhattan layout. In: Brandenburg, F.J. (ed.) GD 1995. LNCS, vol. 1027, pp. 447–458. Springer, Heidelberg (1996)
- 5. Schulze, C.D., Spönemann, M., von Hanxleden, R.: Drawing layered graphs with port constraints. J. Vis. Lang. Comput. Spec. Issue Diagr. Aesthet. Layout **25**(2), 89–106 (2014)
- 6. Sugiyama, K., Tagawa, S., Toda, M.: Methods for visual understanding of hierarchical system structures. IEEE Trans. Syst. Man Cybern. **11**(2), 109–125 (1981)
- 7. von Hanxleden, R., Mendler, M., Aguado, J., Duderstadt, B., Fuhrmann, I., Motika, C., Mercer, S., O’Brien, O., Roop, P.: Sequentially Constructive Concurrency–a conservative extension of the synchronous model of computation. ACM Trans. Embed. Comput. Syst. Spec. Issue Appl. Concurrency Syst. Des. **13**(4s), 144:1–144:26 (2014)