1 Introduction

Negation, closely related to the notion of absence, plays a crucial role in all logics. Indeed, “The capacity to negate is the capacity to refuse, to contradict, to lie, to speak ironically, to distinguish truth from falsity – in short, the capacity to be human” [8]. It has long been recognized that diagrams are sometimes unable to explicitly represent negated statements. Indeed, many of the logics based on Euler diagrams do not permit statements such as \(a\not \in P\) to be made explicitly. Instead, one has to assert that \(a\in P'\), where \(P'\) is the complement of P. There is, however, one exception to this: Choudhury and Chakraborty developed a classical logic called Venn-i that allows \(a\not \in P\) to be directly expressed [5].

Venn-i extends Shin’s Venn-I system, which includes Peirce’s \(\otimes \)-sequences to assert non-emptiness of sets [13], alongside i-sequences and \(\overline{i}\)-sequences to represent individuals and their absence. Since Choudhury and Chakraborty adopt a classical interpretation, the absence of an individual from one set implies its presence in the complement. In Fig. 1, \(D_1\) uses an i-sequence to assert \(a\in P\backslash Q\), using an a, and \(D_2\) negates this statement, expressing \(a\not \in P\backslash Q\), using an \(\overline{i}\)-sequence. Moreover, \(D_2\) is semantically equivalent to \(D_3\), which expresses \(a\in (P\cap Q) \cup (Q\backslash P) \cup ((U\backslash P) \cap (U\backslash Q))\) using an i-sequence, namely \(a-a-a\). An inspiration for Choudhury’s and Chakraborty’s work came from the notion of abhāva (absence). Abhāva, an important feature of ancient Indian knowledge systems, allocates a first class status to the absence of individuals. A philosophical account of absence can be found in [4].

Speaking from the point of view of cognitive science, absence would indicate that though we do not directly perceive the object, we do perceive its absence; there is a mental imagery of the absent object. Thus, when considering a particular individual (of which we have a mental image) we check whether it is in a particular locus and directly perceive its absence. This is reflected by the treatment of Venn-i as a classical logic, where the law of excluded middle holds, as opposed to a sort of constructivist logic where the absence of an individual from one set need not imply its presence in the complement [3].

Fig. 1. Asserting presence and absence.

As we will demonstrate, explicitly representing the absence of individuals allows information to be presented in a less cluttered way. Clutter in Euler diagrams, which are closely related to Venn diagrams, was studied by John et al. [11], who devised a theoretical measure of clutter. Alqadah et al. established that increased levels of clutter in Euler diagrams negatively impact user task performance [1]. Hence, there is clearly a need to theoretically understand clutter in diagrams generally and its impact on end-user task performance.

This paper takes the first step towards understanding clutter arising from the sequences in an extended version of Venn-i, which we call Venn-ie, by:

  • Discussing the interplay between absence and presence, as well as highlighting their asymmetry (Sect. 2),

  • Formalizing the syntax and semantics of Venn-ie, which use Euler diagrams as a basis (Sect. 3),

  • Defining a measure of clutter arising from \(\otimes \)-sequences, i-sequences and \(\overline{i}\)-sequences (Sect. 4); we note here that i-sequences can comprise many nodes whereas \(\overline{i}\)-sequences always have a single node,

  • Identifying necessary and sufficient conditions for Venn-ie diagrams to be unsatisfiable (Sect. 5),

  • Demonstrating how to minimize clutter in satisfiable diagrams by defining inference rules for altering sequences (Sect. 6), and

  • Discussing the role of absence in clutter reduction and its potential implications on task performance (Sect. 7).

We conclude and discuss future work in Sect. 8.

2 Representing Absence Diagrammatically

Semantically equivalent statements can be made about the sets in which an individual lies using either positive or negative statements, such as \(a\in P\cup Q\) versus \(a\not \in P'\cap Q'\) respectively. Whilst various diagrammatic logics include syntax to explicitly make positive statements like \(a\in P\cup Q\), including [18, 19], they have overlooked the possibility of making negative statements like \(a\not \in P'\cap Q'\). One benefit of allowing diagrams to make negative statements is that less cluttered diagrams can be formed: using \(\overline{a}\) signs in diagrams can be more succinct, relative to diagrams using a signs; see Fig. 2. As previously noted, clutter can have a significant negative impact on diagram comprehension.

Fig. 2. Making positive (left) and negative (right) statements.

In Fig. 3, the three diagrams are semantically equivalent. We can reduce the clutter in \(D_1\) by substituting \(\overline{a}\) for the a-sequence, with the result shown in \(D_2\). As well as swapping syntax that makes positive (resp. negative) statements for syntax that makes negative (resp. positive) statements, clutter can also be reduced by removing redundant syntax. The diagram \(D_3\) has more syntax than \(D_1\), such as two additional \(\otimes \)-sequences, and is more cluttered as a result.

Fig. 3. Diagrams with different levels of clutter.

There are fundamental ontological differences between pieces of syntax representing presence and absence. This is because, although syntactically similar, the semantic status of a and \(\overline{a}\) signs is different. Firstly, there are differences relating to their locations within a diagram. If there are distinct i-sequences with the same label placed in disjoint regions then the diagram is inconsistent: disjoint regions represent disjoint sets and a given individual cannot be in two disjoint sets. By contrast, \(\overline{a}\) can be placed in several disjoint regions without giving rise to inconsistency per se: it is entirely possible for an individual to be absent from two disjoint sets, for instance.

Secondly, we observe that the presence of a sequence, either of the form a or \(\overline{a}\), in some region, r, carries existential import. However, this existential import behaves differently: we see that a drawn inside r implies the set, s, that r represents is not empty, whereas \(\overline{a}\) drawn in r implies the complement of s is not empty. Thus, the role of absence in terms of existential import is asymmetrical with presence. This may affect the way diagrams are understood by users.

Thirdly, the interaction of absence with subsumption may contradict intuition. In Fig. 4, \(D_1\) tells us that \(Q \subseteq P\) and \(a\not \in Q\). However, this does not imply \(a\not \in P\), so \(D_1\) does not imply \(D_2\); by contrast, \(Q\subseteq P\) and \(a\in Q\) implies \(a\in P\). This behaviour runs counter to the iconicity [17] of Euler diagrams, which are known to support inference through mechanisms such as free rides [15]. Iconicity is exploited in Euler diagrams through the way that containment indicates subsumption: elements that belong to a set represented by a contained circle belong, “naturally”, to the set represented by the containing circle. On the other hand, the absence of an individual from a set represented by a contained circle does not imply absence from the set represented by the containing circle. Thus, with regard to subsumption, \(\overline{a}\) does not behave transitively, unlike a.

Fig. 4. The interplay between absence and subsumption.

Fig. 5. Syntax: Venn-ie.

To summarize, explicitly representing the absence of individuals allows clutter to be reduced in diagrams. Moreover, we must be mindful of various ontological differences between a and \(\overline{a}\) when reasoning.

3 Syntax and Semantics of Venn-ie

Venn-ie extends Venn-i introduced in [5], relaxing the restriction to Venn diagrams by allowing Euler diagrams to be used. In turn, Venn-i extends Shin’s Venn-I system [16]. As is typical, the abstract syntax is given alongside an informal description of the concrete syntax.

Consider the Venn-ie diagram in Fig. 5. There are two closed curves, labelled P and Q. We conflate the closed curves with their labels and simply say ‘the curve P’, or just ‘P’. The curves give rise to three zones: a zone is a region inside some (possibly no) curves and outside the remaining curves. In Fig. 5, the only shaded zone is inside Q but outside P. The diagram also contains four graphs:

  1. One \(\otimes \)-sequence which comprises a single node,

  2. One i-sequence (i for individual), namely b, comprising two nodes joined by one edge, and

  3. Two \(\overline{i}\)-sequences, namely \(\overline{a}\) and \(\overline{c}\), both of which comprise a single node.

Typically, the abstract syntax for an Euler diagram, D, comprises a set of labels, a set of zones, and a set of shaded zones, written \(D=(L,Z, ShZ )\). Zones are ordered pairs of finite, disjoint sets of labels, \(( in , out )\), where \( in \) (resp. \( out \)) denotes the (labels of) the curves that the zone is inside (resp. outside). The zone outside all of the curves, namely \((\emptyset ,L)\), must be in D and any zone in D satisfies \( in \cup out =L\). The set \( ShZ \) of shaded zones only contains zones in Z.

In Fig. 5, the underlying Euler diagram is \((L,Z, ShZ )\), where \(L=\{P,Q\}\), \(Z=\{(\emptyset ,\{P,Q\}),(\{P\},\{Q\}),(\{Q\},\{P\})\}\) and \( ShZ =\{(\{Q\},\{P\})\}\). The zone \((\emptyset ,\{P,Q\})\) is that which is outside all of the curves, hence the first part of the ordered pair being \(\emptyset \) and the second part containing both P and Q. As we shall see, this zone denotes the set \(P'\cap Q'\).

It is helpful for us to have a set of labels from which all labels used in any diagram are drawn; we call this set \(\mathcal {L}\). When making general statements, we take \(\mathcal {L}=\{\lambda _1,\lambda _2,...\}\) whereas in examples we use P, Q, R, and so forth. Given \(\mathcal {L}\), the set of all zones is denoted \(\mathcal {Z}\). We also have a set of constant symbols, denoted \(\mathcal {C}\), which gives rise to i-sequences and \(\overline{i}\)-sequences. We take \(\mathcal {C}=\{\iota _1,\iota _2,...\}\); in examples, we use a, b, c, and so forth.

The regions (i.e. non-empty sets of zones) in a diagram need to be associated with the sequences drawn in them. In general, \(\otimes \)-sequences and i-sequences can have nodes placed in many zones, whereas \(\overline{i}\)-sequences always have a single node. This reflects the dual role of i-sequences and \(\overline{i}\)-sequences: an a-sequence in the region \(r=\{z_1,...,z_n\}\) asserts that \(a\in z_1\vee ... \vee a\in z_n\), which is equivalent to \(a\not \in z_{n+1} \wedge ... \wedge a\not \in z_{n+m}\), where \(z_{n+1},...,z_{n+m}\) are the zones not in r. This equivalent statement can be made by a set of \(\overline{a}\)-sequences, one in each zone not in r. To identify the sequences in each region, we use three binary relations \(\rho _\otimes \), \(\rho _i\) and \(\rho _{\overline{i}}\). In Fig. 5, \(\rho _\otimes =\{(\{(\{P\},\{Q\})\},\otimes _1)\}\), \(\rho _{i}=\{(\{(\{Q\},\{P\}),(\emptyset ,\{P,Q\})\},b)\}\), and \(\rho _{\overline{i}}=\{((\{P\},\{Q\}),a),((\emptyset ,\{P,Q\}),c)\}.\)

Definition 1

A Venn-ie diagram, D, is a tuple, \(D=(L,Z, ShZ ,\rho _\otimes ,\rho _i,\rho _{\overline{i}})\) such that:

  1. L is a finite set of labels chosen from \(\mathcal {L}\).

  2. Z is a set of zones where \((\emptyset ,L)\in Z\) and for all \(( in , out )\) in Z, \( in \cup out =L\).

  3. \( ShZ \) is a subset of Z whose elements are called shaded zones.

  4. \(\rho _{\otimes }\subseteq (\mathbb {P}Z\backslash \{\emptyset \})\times \{\otimes \}\) is a finite binary relation that associates non-empty regions with \(\otimes \) symbols. The elements of \(\rho _{\otimes }\) are called \(\otimes \)-sequences.

  5. \(\rho _i \subseteq (\mathbb {P}Z\backslash \{\emptyset \})\times \mathcal {C}\) is a finite binary relation that associates non-empty regions with constant symbols. The elements of \(\rho _{i}\) are called i-sequences.

  6. \(\rho _{\overline{i}}\subseteq Z\times \mathcal {C}\) is a finite binary relation that associates zones with constant symbols. The elements of \(\rho _{\overline{i}}\) are called \(\overline{i}\)-sequences.

The missing zones of D are elements of \( MZ =\{( in , out )\in \mathcal {Z}: in \cup out =L \}\backslash Z.\) Furthermore, given a constant, \(\iota \), the set of \(\iota \)-sequences in D is denoted \(I(\iota )\) where \(I(\iota )=\{(r,\iota ): (r,\iota )\in \rho _{i}\}\). Similarly, the set of \(\overline{\iota }\)-sequences in D is denoted \(I(\overline{\iota })\) where \(I(\overline{\iota })=\{(z,\iota ): (z,\iota )\in \rho _{\overline{i}}\}.\)
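As an illustration of Definition 1, the diagram of Fig. 5 can be written down as plain data. The following is only a sketch: the helper names `zone` and `all_zones`, the variable names, and the string `"otimes"` standing in for the \(\otimes \) symbol are assumptions made for this example and are not part of the formalization.

```python
from itertools import combinations

def zone(inside, outside):
    """Build a zone (in, out) from two disjoint collections of labels."""
    return (frozenset(inside), frozenset(outside))

def all_zones(labels):
    """All zones (in, out) over `labels` whose union is the full label set."""
    labels = frozenset(labels)
    return {
        zone(inside, labels - set(inside))
        for k in range(len(labels) + 1)
        for inside in combinations(sorted(labels), k)
    }

# The diagram of Fig. 5, transcribed from the description in the text.
L = {"P", "Q"}
outside_both = zone([], ["P", "Q"])
only_P = zone(["P"], ["Q"])
only_Q = zone(["Q"], ["P"])

Z = {outside_both, only_P, only_Q}
ShZ = {only_Q}
rho_otimes = {(frozenset({only_P}), "otimes")}          # one single-node sequence
rho_i = {(frozenset({only_Q, outside_both}), "b")}      # two-node b-sequence
rho_ibar = {(only_P, "a"), (outside_both, "c")}         # absence of a and of c

# Missing zones: zones over L that are not drawn in the diagram.
MZ = all_zones(L) - Z
print(MZ)
```

For this diagram `MZ` contains exactly one zone, the one inside both P and Q, which matches the fact that the curves in Fig. 5 give rise to only three zones.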

The underlying Euler diagrams have the typical semantics: the closed curves represent sets and their spatial relationships correspond to set-theoretic relationships. Shading asserts emptiness, as seen in Shin’s systems [16]. Sequences give information about the location of elements in sets. First, \(\otimes \)-sequences, introduced by Peirce [13], assert the non-emptiness of sets. Second, i-sequences assert that the denoted individuals are in the sets represented by the regions in which they are placed. Lastly, each \(\overline{i}\)-sequence asserts the absence of the denoted individual from the set represented by the zone in which it is placed. In Fig. 5, the b-sequence asserts that \(b\in Q\cap P'\) or \(b\in P'\cap Q'\), since b is in the two zone region \(\{(\{Q\},\{P\}),(\emptyset ,\{P,Q\})\}\). Likewise, the \(\overline{c}\)-sequence is in the zone \((\emptyset ,\{P,Q\})\) which means that \(c\not \in P'\cap Q'\). To formalize the semantics, we adopt a standard model-theoretic approach.

Definition 2

An interpretation, \(\mathcal {I}\), is a triple, \(\mathcal {I}=(U,\psi ,\varPsi )\), such that

  1. U is a non-empty set, called the universal set,

  2. \(\psi :\mathcal {C}\rightarrow U\) maps constants to elements in U, and

  3. \(\varPsi :\mathcal {L}\rightarrow \mathbb {P}U\) maps curve labels to subsets of U.

The function \(\varPsi \) is extended to interpret zones as follows: for each zone, \(z=( in , out )\),

$$\varPsi (z)=\bigcap \limits _{l\in in }\varPsi (l) \cap \bigcap \limits _{l\in out }(U\backslash \varPsi (l)).$$

Definition 3

Let \(D=(L,Z, ShZ ,\rho _\otimes ,\rho _i,\rho _{\overline{i}})\) be a Venn-ie diagram and let \(\mathcal {I}=(U,\psi ,\varPsi )\) be an interpretation. Then \(\mathcal {I}\) is a model for D provided the following conditions all hold.

  1. Missing Zones Condition: for each \(z\in MZ \), \(\varPsi (z)=\emptyset \).

  2. Shaded Zones Condition: for each \(z\in ShZ \), \(\varPsi (z)=\emptyset \).

  3. \(\otimes \)-Sequence Condition: for each \((r,\otimes )\in \rho _{\otimes }\), \(\varPsi (z)\ne \emptyset \) for some \(z\in r\).

  4. i-Sequence Condition: for each \((r,\iota )\in \rho _{i}\), \(\psi (\iota )\in \varPsi (z)\) for some \(z\in r\).

  5. \(\overline{i}\)-Sequence Condition: for each \((z,\iota )\in \rho _{\overline{i}}\), \(\psi (\iota )\not \in \varPsi (z)\).

If \(\mathcal {I}\) models D then \(\mathcal {I}\) satisfies D. Diagrams with no models are unsatisfiable.
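Definitions 2 and 3 can be checked mechanically on small finite interpretations. The sketch below is illustrative only: the function names, the tuple layout of a diagram, and the restriction of \(\varPsi \) and \(\psi \) to the labels and constants actually used are all assumptions made for the example.

```python
from itertools import combinations

def all_zones(labels):
    """All zones (in, out) over `labels` whose union is the full label set."""
    labels = frozenset(labels)
    return {
        (frozenset(inside), labels - frozenset(inside))
        for k in range(len(labels) + 1)
        for inside in combinations(sorted(labels), k)
    }

def zone_set(z, U, Psi):
    """Psi extended to a zone: inside every 'in' curve, outside every 'out' curve."""
    inside, outside = z
    members = set(U)
    for label in inside:
        members &= Psi[label]
    for label in outside:
        members -= Psi[label]
    return members

def is_model(D, U, psi, Psi):
    """Check the five conditions of Definition 3 for a finite interpretation."""
    L, Z, ShZ, rho_otimes, rho_i, rho_ibar = D
    empty_zones = (all_zones(L) - Z) | ShZ  # missing and shaded zones
    return (
        len(U) > 0
        and all(not zone_set(z, U, Psi) for z in empty_zones)
        and all(any(zone_set(z, U, Psi) for z in r) for r, _ in rho_otimes)
        and all(any(psi[c] in zone_set(z, U, Psi) for z in r) for r, c in rho_i)
        and all(psi[c] not in zone_set(z, U, Psi) for z, c in rho_ibar)
    )

# The diagram of Fig. 5, as described in the text.
o = (frozenset(), frozenset({"P", "Q"}))
p = (frozenset({"P"}), frozenset({"Q"}))
q = (frozenset({"Q"}), frozenset({"P"}))
fig5 = ({"P", "Q"}, {o, p, q}, {q},
        {(frozenset({p}), "otimes")},
        {(frozenset({q, o}), "b")},
        {(p, "a"), (o, "c")})

# A hypothetical interpretation: P = {1}, Q is empty.
U = {1, 2}
Psi = {"P": {1}, "Q": set()}
good = is_model(fig5, U, {"a": 2, "b": 2, "c": 1}, Psi)
bad = is_model(fig5, U, {"a": 2, "b": 2, "c": 2}, Psi)
print(good, bad)  # True False
```

In the second interpretation c is mapped into \(P'\cap Q'\), which violates the \(\overline{i}\)-Sequence Condition for \(\overline{c}\).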

Fig. 6. Measuring clutter in Venn-ie diagrams.

4 Measuring Clutter

We require a measure of clutter arising from the sequences. Figure 6 shows three simple examples, all with the same underlying Euler diagram. The lefthand diagram with just one node, namely \(\overline{a}\), is less cluttered than the middle diagram. The righthand diagram is the most cluttered, since this has two nodes (both named a) and a connecting edge. Thus, to measure the clutter arising from the sequences, we count the number of nodes and the number of edges.

Definition 4

Let \(D=(L,Z, ShZ ,\rho _\otimes ,\rho _i,\rho _{\overline{i}})\) be a Venn-ie diagram. The sequence clutter score for D, denoted \(\mathcal {SCS}(D)\), is

$$\mathcal {SCS}(D)=\bigg (\sum \limits _{(r,\otimes )\in \rho _{\otimes }}(2|r|-1)\bigg )+ \bigg (\sum \limits _{(r,\iota )\in \rho _{i}}(2|r|-1)\bigg )+ |\rho _{\overline{i}}|$$

The three diagrams in Fig. 6 have sequence clutter scores 1, 2, and 3 respectively. From this point forward, we simply say clutter score.
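A minimal sketch of Definition 4 follows. A sequence placed in a region with |r| zones has |r| nodes and |r| - 1 edges, hence the term 2|r| - 1. The function name `scs` and the zone names `z1` and `z2` are placeholders. The middle diagram of Fig. 6 is assumed here to contain two \(\overline{a}\) nodes; a score of 2 can only arise from two single-node sequences, but their kind is not stated in the text.

```python
def scs(rho_otimes, rho_i, rho_ibar):
    """Sequence clutter score: nodes plus edges over all sequences."""
    multi_node = list(rho_otimes) + list(rho_i)
    return sum(2 * len(region) - 1 for region, _ in multi_node) + len(rho_ibar)

# Left: a single absence node. Middle (assumed): two absence nodes.
# Right: one a-sequence with two nodes joined by an edge.
left = scs([], [], [("z1", "a")])
middle = scs([], [], [("z1", "a"), ("z2", "a")])
right = scs([], [({"z1", "z2"}, "a")], [])
print(left, middle, right)  # 1 2 3
```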

Definition 5

Let \(D=(L,Z, ShZ ,\rho _\otimes ,\rho _i,\rho _{\overline{i}})\) be a Venn-ie diagram. Then D is minimally cluttered if there does not exist a semantically equivalent diagram, \(D'=(L,Z, ShZ ,\rho _\otimes ',\rho _i',\rho _{\overline{i}}')\), such that \(\mathcal {SCS}(D')<\mathcal {SCS}(D)\).

5 Minimizing Clutter in Inconsistent Diagrams

Figure 7 shows a minimally cluttered inconsistent diagram, namely \(D_1\): it has a clutter score of 0; thus, any inconsistent diagram is semantically equivalent to \(D_1\). To allow us to focus on consistent diagrams, when algorithmically reducing clutter, we need to identify syntactic conditions which capture inconsistency. There are various ways in which Venn-ie diagrams can be inconsistent:

Fig. 7. Inconsistent Venn-ie diagrams.

  1. All interpretations have a non-empty universal set, so a diagram is inconsistent if it is entirely shaded. See \(D_1\) in Fig. 7.

  2. Shaded regions containing entire \(\otimes \)-sequences or i-sequences are inconsistent since the shading asserts set emptiness whereas the sequence implies set non-emptiness. See \(D_2\) in Fig. 7, where each sequence gives rise to inconsistency.

  3. There are i-sequences placed in regions that do not share a common non-shaded zone, z, where z does not contain an \(\overline{i}\)-sequence with the same label. Intuitively, for each \(\iota \), the individual represented must lie in the set denoted by a non-shaded zone that is shared by all \(\iota \)-sequences in \(I(\iota )\). If all such zones include \(\overline{\iota }\) then the diagram also asserts that the individual is absent from the sets represented by those zones. See \(D_3\) in Fig. 7.

  4. The set of \(\overline{i}\)-sequences for constant symbol \(\iota \), namely \(I(\overline{\iota })\), cannot include all non-shaded zones. If all non-shaded zones were included then the law of excluded middle tells us that the represented individual must be an element of the empty set. See \(D_4\) in Fig. 7.

Definition 6

(Inconsistency). Let \(D=(L,Z, ShZ ,\rho _\otimes ,\rho _i,\rho _{\overline{i}})\) be a Venn-ie diagram. Whenever any one of the following conditions holds D is inconsistent.

  1. All zones are shaded: \(Z = ShZ \).

  2. There is an \(\otimes \)-sequence, say \((r,\otimes )\), in D such that \(r \subseteq ShZ \).

  3. There is an i-sequence, say \((r,\iota )\), in D such that for all zones, z, in \(\bigcap \limits _{r'\in I(\iota )} r'\), either z is shaded or \((z,\iota )\in I(\overline{\iota })\).

  4. There is an \(\overline{i}\)-sequence, say \((z,\iota )\), in D such that \(Z\backslash \{z': (z',\iota )\in I(\overline{\iota })\}\subseteq ShZ \).

If D is not inconsistent then D is consistent.

Theorem 1

(Inconsistent). D is inconsistent iff D is unsatisfiable.

Using Theorem 1 we can therefore identify whether any given diagram is inconsistent. Given such a diagram \(D=(L,Z,Z,\rho _{\otimes },\rho _i,\rho _{\overline{i}})\) we can see that a minimally cluttered, semantically equivalent diagram is \(D_{ min }=(L,Z,Z,\emptyset ,\emptyset ,\emptyset )\).
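The four syntactic conditions of Definition 6 translate directly into set operations. The following sketch assumes zones are any hashable values (strings here, for brevity) and that sequences are given as (region, symbol) or (zone, symbol) pairs; the function name and the zone names `o`, `p`, `q` are invented for the example.

```python
def is_inconsistent(Z, ShZ, rho_otimes, rho_i, rho_ibar):
    """Return True when one of the four conditions of Definition 6 holds."""
    Z, ShZ = set(Z), set(ShZ)

    def absent(c):
        # Zones that carry an absence sequence for constant c.
        return {z for z, c2 in rho_ibar if c2 == c}

    # Condition 1: every zone is shaded.
    if Z == ShZ:
        return True
    # Condition 2: some otimes-sequence lies entirely in shaded zones.
    if any(set(r) <= ShZ for r, _ in rho_otimes):
        return True
    # Condition 3: the zones common to all c-sequences are shaded or marked absent.
    for c in {c for _, c in rho_i}:
        common = set.intersection(*(set(r) for r, c2 in rho_i if c2 == c))
        if common <= ShZ | absent(c):
            return True
    # Condition 4: c is marked absent from every non-shaded zone.
    for c in {c for _, c in rho_ibar}:
        if Z - absent(c) <= ShZ:
            return True
    return False

# Fig. 5 with o = outside both curves, p = inside P only, q = inside Q only.
Z, ShZ = {"o", "p", "q"}, {"q"}
fig5 = is_inconsistent(Z, ShZ, [({"p"}, "otimes")],
                       [({"q", "o"}, "b")], [("p", "a"), ("o", "c")])
# Adding a second absence node for a leaves a with nowhere to live.
clash = is_inconsistent(Z, ShZ, [({"p"}, "otimes")],
                        [({"q", "o"}, "b")], [("p", "a"), ("o", "a")])
print(fig5, clash)  # False True
```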

6 Minimizing Clutter in Consistent Diagrams

The goal of this section is to produce minimally cluttered diagrams using inference rules that alter their sequences. To this end, we first define some useful transformations on diagrams.

Transformation 1

(Sequence Removal). Let \(D=(L,Z, ShZ ,\rho _\otimes ,\rho _i,\rho _{\overline{i}})\) be a Venn-ie diagram. Let \((r,\bullet )\) be a sequence in D. We define three removal operations on D:

  1. If \((r,\bullet )\in \rho _\otimes \) then \(D-(r,\bullet )=(L,Z, ShZ ,\rho _\otimes \backslash \{(r,\bullet )\},\rho _i,\rho _{\overline{i}}).\)

  2. If \((r,\bullet )\in \rho _i\) then \(D-(r,\bullet )=(L,Z, ShZ ,\rho _\otimes ,\rho _i\backslash \{(r,\bullet )\},\rho _{\overline{i}})\).

  3. If \((r,\bullet )\in \rho _{\overline{i}}\) then \(D-\overline{(r,\bullet )}=(L,Z, ShZ ,\rho _\otimes ,\rho _i,\rho _{\overline{i}}\backslash \{(r,\bullet )\}).\)

Transformation 2

(Sequence Addition). Let \(D=(L,Z, ShZ ,\rho _\otimes ,\rho _i,\rho _{\overline{i}})\) be a Venn-ie diagram. Let \((r,\bullet )\) be a sequence such that \(r\subseteq Z\) or \(r\in Z\). We define three addition operations on D:

  1. If \(r\subseteq Z\) and \(\bullet =\otimes \) then \(D+(r,\bullet )=(L,Z, ShZ ,\rho _\otimes \cup \{(r,\bullet )\},\rho _i,\rho _{\overline{i}}).\)

  2. If \(r\subseteq Z\) and \(\bullet \in \mathcal {C}\) then \(D+(r,\bullet )=(L,Z, ShZ ,\rho _\otimes ,\rho _i\cup \{(r,\bullet )\},\rho _{\overline{i}})\).

  3. If \(r\in Z\) and \(\bullet \in \mathcal {C}\) then \(D+\overline{(r,\bullet )}=(L,Z, ShZ ,\rho _\otimes ,\rho _i,\rho _{\overline{i}} \cup \{(r,\bullet )\}).\)

Fig. 8. Clutter reduction in consistent diagrams.

Before we present our inference rules, we work through an example showing how to minimize clutter. Consider Fig. 8. Here, the diagram D is consistent, but not minimally cluttered. To reduce clutter, we make various observations and adopt the following process:

  1. First we observe that whenever we express information using \(\overline{i}\)-sequences, we can instead use an i-sequence. Thus, in D we can swap \(\overline{a}\) for an a-sequence, as shown in \(D_1\). In general, this swap may result in the clutter score increasing, but it allows us to more easily identify, syntactically, the region in which a must represent an element.

  2. Next, we observe that in \(D_1\) (and in any diagram), we only need one occurrence of each constant symbol to specify in which set it lies. So, we can reduce the three a-sequences in \(D_1\) to a single a-sequence shown in \(D_2\). This single a-sequence is placed in the zone common to all of the a-sequences in \(D_1\), thus allowing us to see which region contains the individual a. In this step, the clutter score of \(D_2\) is lower than that of \(D_1\).

  3. Reductions can also be made to sequences that are placed in regions which contain shaded zones, since shaded zones represent empty sets. The diagram \(D_2\) contains two such sequences, \((r_b,b)\) and \((r_\otimes ,\otimes )\), and can be replaced by \(D_3\).

  4. Some sequences are redundant and can be removed from diagrams. In \(D_3\), the \(\otimes \)-sequence is redundant since it tells us that \(Q\backslash (P\cup R)\ne \emptyset \), which can be deduced from the a-sequence. So \(D_3\) can be replaced by \(D_4\).

  5. Lastly, we examine each i-sequence in turn. If its contribution to the clutter score can be reduced by swapping it for \(\overline{i}\)-sequences then this swap is performed. Here, the b-sequence is swapped for two \(\overline{b}\)-sequences, resulting in \(D_{min}\). This last step exploits the use of absence to reduce diagram clutter.

As we have just seen, it is possible to swap i-sequences for \(\overline{i}\)-sequences, and vice versa, reflecting their dual roles. For example, in Fig. 9, the a-sequence in \(D_1\) tells us \(a\in P\cap Q'\cap R'\) or \(a\in P\cap Q\cap R'\). Given the shading and the spatial relationships between the curves, asserting \(a\not \in P'\cap Q'\cap R'\) is equivalent. This alternative representation is seen in \(D_2\). We can swap the i-sequence \((\{(\{P\},\{Q,R\}),(\{P,Q\},\{R\})\},a)\) for the \(\overline{i}\)-sequence \(((\emptyset ,\{P,Q,R\}),a)\).

Fig. 9. Swapping sequences.

Inference Rule 1

(Swap i-Sequence). Let \(D=(L,Z, ShZ ,\rho _\otimes ,\rho _i,\rho _{\overline{i}})\) be a Venn-ie diagram. Let \((r,\iota )\) be an i-sequence in D. Then \((r,\iota )\) may be swapped for the set \(\{(z,\iota ): z\in Z\backslash ( ShZ \cup r)\}=\{(z_1,\iota ),...,(z_n,\iota )\}\) of \(\overline{i}\)-sequences. That is, D may be replaced by \(D-(r,\iota )+\overline{(z_1,\iota )}+...+\overline{(z_n,\iota )}\) and vice versa.

Inference Rule 2

(Swap \(\overline{i}\)-Sequences). Let \(D=(L,Z, ShZ ,\rho _\otimes ,\rho _i,\rho _{\overline{i}})\) be a Venn-ie diagram. Let \(\iota \) be a constant symbol such that \(I(\overline{\iota })\ne \emptyset \) and \(Z\backslash ( ShZ \cup \{z_1,...,z_n\})\ne \emptyset \), where \(I(\overline{\iota })=\{(z_1,\iota ),...,(z_n,\iota )\}\). Then \(I(\overline{\iota })\) may be swapped for the i-sequence \((Z\backslash ( ShZ \cup \{z_1,...,z_n\}),\iota )\). That is, D may be replaced by

$$D-\overline{(z_1,\iota )}-...-\overline{(z_n,\iota )}+(Z\backslash ( ShZ \cup \{z_1,...,z_n\}),\iota )$$

and vice versa.
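Both swap rules amount to taking the complement of a region within the non-shaded zones. The sketch below illustrates this; the function names are invented, and because Fig. 9 is not reproduced here, the zone set is a guess chosen only to agree with the text (three non-shaded zones, plus one hypothetical shaded zone named `PR`).

```python
def swap_i_for_ibar(Z, ShZ, r):
    """Rule 1: zones that each receive an absence node when (r, c) is removed."""
    return set(Z) - (set(ShZ) | set(r))

def swap_ibar_for_i(Z, ShZ, absent_zones):
    """Rule 2: region of the single i-sequence replacing all absence nodes of c."""
    region = set(Z) - (set(ShZ) | set(absent_zones))
    if not absent_zones or not region:
        raise ValueError("Inference Rule 2 does not apply")
    return region

# Hypothetical zone names: "none" is outside all curves, "P" is inside P only,
# "PQ" is inside P and Q only, and "PR" is an assumed shaded zone.
Z = {"none", "P", "PQ", "PR"}
ShZ = {"PR"}

to_absence = swap_i_for_ibar(Z, ShZ, {"P", "PQ"})
back_to_presence = swap_ibar_for_i(Z, ShZ, to_absence)
print(to_absence)                  # {'none'}
print(sorted(back_to_presence))    # ['P', 'PQ']
```

Applying the two functions in turn returns the original region, reflecting the "and vice versa" in both rules for this example.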

There are also occasions when we can remove parts of sequences: when the region in which a sequence is placed includes a shaded zone, the part in the shaded zone can be deleted, thus reducing the sequence. Moreover, we have also seen that sets of i-sequences can be reduced.

Inference Rule 3

(Reduce Sequence). Let \(D=(L,Z, ShZ ,\rho _\otimes ,\rho _i,\rho _{\overline{i}})\) be a Venn-ie diagram. Let \((r,\bullet )\) be an \(\otimes \)-sequence or i-sequence in D such that r contains at least two zones, one of which, z say, is shaded. Then D may be replaced by \(D-(r,\bullet )+(r\backslash \{z\},\bullet )\) and vice versa. Such a sequence is said to be reducible in D.

Inference Rule 4

(Reduce a Set of Sequences). Let \(D=(L,Z, ShZ ,\rho _\otimes ,\rho _i,\rho _{\overline{i}})\) be a Venn-ie diagram. Let \(\iota \) be a constant symbol such that \(I(\iota )\ne \emptyset \) and \(r\ne \emptyset \) where

$$r=\big (\bigcap \limits _{(r_i,\iota )\in I(\iota )}r_i\big )\backslash \big ( ShZ \cup \bigcup \limits _{(z,\iota )\in I(\overline{\iota })}\{z\}\big ).$$

Then D may be replaced by

$$D-(r_1,\iota )-...-(r_n,\iota )+(r,\iota )$$

and vice versa, where \(I(\iota )=\{(r_1,\iota ),...,(r_n,\iota )\}\). The set \(I(\iota )\) of sequences is said to be reducible in D.
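The two reduction rules can be sketched as follows. `reduce_sequence` applies Inference Rule 3 repeatedly, dropping shaded zones for as long as the region still has at least two zones; `reduce_set` computes the region r of Inference Rule 4. The function names and zone names are placeholders, not taken from the figures.

```python
def reduce_sequence(r, ShZ):
    """Rule 3, applied repeatedly: drop shaded zones while two or more zones remain."""
    r = set(r)
    for z in sorted(r & set(ShZ), key=repr):
        if len(r) >= 2:
            r.remove(z)
    return r

def reduce_set(regions, ShZ, absent_zones):
    """Rule 4: the single region replacing all i-sequences of one constant."""
    regions = [set(region) for region in regions]
    if not regions:
        raise ValueError("there are no sequences to reduce")
    r = set.intersection(*regions) - (set(ShZ) | set(absent_zones))
    if not r:
        raise ValueError("Inference Rule 4 does not apply: the region is empty")
    return r

# A three-node sequence with one node in a shaded zone loses that node.
shortened = reduce_sequence({"Q", "QR", "none"}, ShZ={"QR"})
# A sequence lying wholly in one shaded zone cannot be reduced further.
stuck = reduce_sequence({"QR"}, ShZ={"QR"})
# Three sequences for the same constant collapse to their common zone.
merged = reduce_set([{"Q", "P"}, {"Q", "R"}, {"Q", "none"}],
                    ShZ=set(), absent_zones={"P"})
print(sorted(shortened), sorted(stuck), sorted(merged))
```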

There are various ways in which an \(\otimes \)-sequence can be redundant in a diagram, in the sense that its removal does not alter the semantics:

  1. An \(\otimes \)-sequence, \((r,\otimes )\), that includes all of the non-shaded zones in r is redundant, since this amounts to asserting that \(U\ne \emptyset \) which is necessarily true in all interpretations.

  2. In \(D_1\), Fig. 10, the single-node \(\otimes \)-sequence asserts \(P\cap Q'\cap R'\ne \emptyset \). From this we can deduce \(P\ne \emptyset \), asserted by the two-node \(\otimes \)-sequence which is, thus, redundant.

  3. In \(D_1\), the a-sequence tells us that \(a\in P\cap Q'\cap R'\) or \(a\in P'\cap Q'\cap R'\). The shading asserts \(P'\cap Q'\cap R'=\emptyset \), so \(a\in P\cap Q'\cap R'\). This implies that \(P\cap Q'\cap R'\ne \emptyset \), so the single-node \(\otimes \)-sequence is also redundant. The presence of the individual a has permitted a reduction in diagram clutter.

  4. Lastly, in \(D_1\) the location of \(\overline{b}\) tells us that \(b\not \in P\cap Q'\cap R'\), from which, together with the shading, it follows that \(b\in Q\cup R\). Therefore, the four-node \(\otimes \)-sequence asserting \(Q\cup R\ne \emptyset \) is redundant. The absence of the individual b has permitted a reduction in diagram clutter.

Removing the \(\otimes \)-sequences from \(D_1\) in Fig. 10 to give \(D_2\) reduces the clutter score from 15 to 4.
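The score of 15 can be reconstructed from Definition 4, on the assumption that \(D_1\) contains only the sequences mentioned in the list above: \(\otimes \)-sequences with one, two and four nodes, a two-node a-sequence, and a single \(\overline{b}\). Then

$$\mathcal {SCS}(D_1)=(2\cdot 1-1)+(2\cdot 2-1)+(2\cdot 4-1)+(2\cdot 2-1)+1=1+3+7+3+1=15.$$

Under the same assumption, removing the three \(\otimes \)-sequences leaves \(\mathcal {SCS}(D_2)=(2\cdot 2-1)+1=4\).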

Fig. 10. Redundant \(\otimes \)-sequences.

Fig. 11. Redundant i-sequences.

Inference Rule 5

(Remove \(\otimes \)-Sequence). Let \(D=(L,Z, ShZ ,\rho _\otimes ,\rho _i,\rho _{\overline{i}})\) be a Venn-ie diagram and let \((r,\otimes )\) be an \(\otimes \)-sequence in D such that either:

  1. The region r includes all non-shaded zones: \(Z\backslash ShZ \subseteq r\),

  2. There is a distinct \(\otimes \)-sequence, say \((r',\otimes )\), in D where \(r'\backslash ShZ \subseteq r\),

  3. There is an i-sequence, say \((r',\iota )\), in D such that \(r'\backslash ShZ \subseteq r\), or

  4. Given \(I(\overline{\iota })=\{(z_1,\iota ),...,(z_n,\iota )\}\), it is the case that \((Z\backslash ( ShZ \cup \{z_1,...,z_n\}))\subseteq r\).

Then D can be replaced by \(D-(r,\otimes )\) and vice versa and we say \((r,\otimes )\) is redundant in D.

Considering i-sequences, in Fig. 11 two of them are redundant in \(D_1\):

  1. The a-sequence with a node in the shaded region tells us that \(a\in P' \cap Q' \cap R'\). From this, we can deduce that \(a\in P' \cap Q'\cap R'\) or \(a\in P \cap Q'\cap R'\), asserted by the other a-sequence, so this second a-sequence is redundant.

  2. The presence of the two \(\overline{b}\)-sequences, together with the shading, allows us to infer that \(b\in P\cap Q'\cap R'\) or \(b\in P \cap Q \cap R'\), expressed by the b-sequence. Thus, the b-sequence is redundant. Again, we see that the absence of the individual b has permitted a reduction in diagram clutter.

Removing the i-sequences from \(D_1\) reduces the clutter score from 11 to 5 in \(D_2\).

Inference Rule 6

(Remove i-Sequence). Let \(D=(L,Z, ShZ ,\rho _\otimes ,\rho _i,\rho _{\overline{i}})\) be a Venn-ie diagram and let \((r,\iota )\) be an i-sequence in D such that either:

  1. the region r includes all non-shaded zones: \(Z\backslash ShZ \subseteq r\),

  2. there is a distinct i-sequence, \((r',\iota )\), in D such that \(r'\backslash ShZ \subseteq r\), or

  3. there is a set of \(\overline{i}\)-sequences, say \(I=\{(z_1,\iota ),...,(z_n,\iota )\}\) such that \((r\cup \{z_1,...,z_n\})\backslash ShZ =Z\backslash ShZ \).

Then D can be replaced by \(D-(r,\iota )\) and vice versa and we say \((r,\iota )\) is redundant in D.

Importantly, all inference rules preserve semantics. In addition, other than the swap rules, applying them never increases diagram clutter. These two properties are captured in Theorem 2.

Theorem 2

(Soundness and Clutter Reduction). Let D and \(D'\) be Venn-ie diagrams such that \(D'\) is obtained from D by applying one of the inference rules. Then D and \(D'\) are semantically equivalent and, if the inference rule applied was not a swap rule, the clutter score of \(D'\) is at most that of D.

We are now in a position to show how to minimize clutter in consistent diagrams. Algorithm 1 presents the steps in detail. Referring to Fig. 8, the input to Algorithm 1 is D. Step 1 iteratively removes \(\overline{i}\)-sequences using inference rule 2, of which D has just one (namely \(\overline{a}\)), to give \(D_1\). Step 2 iteratively reduces sets of i-sequences using inference rule 4. In this case, the set of a-sequences is reducible and the result is shown in \(D_2\). Taking \(D_2\), step 3 reduces all reducible sequences using inference rule 3; here the result is \(D_3\), where two sequences have altered due to the presence of shading. Step 4 proceeds to remove redundant sequences using inference rule 5, resulting in \(D_4\). Lastly, step 5 inspects the i-sequences to see whether clutter is reduced by swapping them for \(\overline{i}\)-sequences. In this case, it is beneficial to swap b for two \(\overline{b}\)s: the b-sequence contributes 5 to the clutter score, whereas the (swapped) \(\overline{b}\)-sequences in \(D_{ min }\) contribute just 2. However, the a-sequence contributes only 1 to the clutter score of \(D_4\), so is retained, not swapped. \(D_{ min }\) is the output from Algorithm 1. Lastly, we note that minimally cluttered diagrams are not, in general, unique. It should be clear from the last step of Algorithm 1 that it is sometimes possible to swap sequences without altering the clutter score.
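The decision made in step 5 can be sketched as a comparison of two costs: keeping an i-sequence in region r costs 2|r| - 1, whereas swapping it costs one node per non-shaded zone outside r. This is only an illustration of that single step, not a transcription of Algorithm 1; the function name and all zone names are hypothetical, chosen so that the counts agree with those quoted for \(D_4\).

```python
def step5_choice(Z, ShZ, r):
    """Decide whether an i-sequence in region r is cheaper as absence nodes."""
    keep_cost = 2 * len(r) - 1
    absence_zones = set(Z) - (set(ShZ) | set(r))
    if len(absence_zones) < keep_cost:
        return "swap", absence_zones
    return "keep", set(r)

# Hypothetical zones: five non-shaded zones and one shaded zone "QR".
Z = {"none", "P", "Q", "R", "PQ", "QR"}
ShZ = {"QR"}

b_choice = step5_choice(Z, ShZ, {"Q", "R", "none"})  # three-node b-sequence, cost 5
a_choice = step5_choice(Z, ShZ, {"Q"})               # single-node a-sequence, cost 1
print(b_choice[0], sorted(b_choice[1]))  # swap ['P', 'PQ']
print(a_choice[0], sorted(a_choice[1]))  # keep ['Q']
```

The strict inequality mirrors the requirement that the contribution to the clutter score be reduced; when the two costs are equal the swap is permitted but not beneficial, which is one reason minimally cluttered diagrams need not be unique.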

Theorem 3

(Clutter Minimization). Let D be a consistent Venn-ie diagram and let \(D_{ min }\) be the result of applying Algorithm 1 to D. Then D and \(D_{ min }\) are semantically equivalent and \(D_{ min }\) is minimally cluttered.

The proof can be found online [2].

7 Cognitive Implications

As we have seen, it is possible to reduce clutter in a diagram by removing sequences, reducing them and swapping between i-sequences and \(\overline{i}\)-sequences. Whilst earlier research into diagram clutter has established that increasing clutter levels correlates with decreased task performance, it is unclear whether and when this remains true for Venn-ie diagrams. We conjecture that the impact of clutter on task performance will be task dependent.

For instance, consider the semantically equivalent diagrams in Fig. 8 and suppose that we are asked to determine the set in which the individual a lies. We conjecture that this task is easier to perform by studying \(D_{ min }\) than by studying D. This is because a is more salient in \(D_{ min }\), due to the reduced amount of syntax present: this could make it quicker to identify the location of a. Thus, for this task, it could be that \(D_{ min }\) promotes improved task performance.

Suppose now that our task is to determine whether b is not in P. \(D_{ min }\) explicitly represents this information using absence (i.e. \(\overline{b}\)), whereas it must be deduced from D: identify the location of b and deduce that b is not in P. Here, we conjecture that the use of absence directly aids performance. There are also tasks for which neither D nor \(D_{ min }\) is likely to be ‘optimal’. For example, suppose we wish to determine the set in which b lies. Perhaps the best representation of this information is \(D_4\), which includes a three-node b-sequence (by contrast, D, \(D_1\), \(D_2\) and \(D_3\) are more cluttered). From \(D_4\), we can read off the fact that either b is in just R, b is in just Q, or b is in none of P, Q, and R.
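Writing U for the universal set, as in the introduction, the information read off from the three-node b-sequence in \(D_4\) can be restated as a single membership statement. This is only a set-theoretic rendering of the sentence above, not an additional result:

\[
b \in \bigl(R\backslash (P\cup Q)\bigr) \cup \bigl(Q\backslash (P\cup R)\bigr) \cup \bigl(U\backslash (P\cup Q\cup R)\bigr)
\]

None of the three sets in this union meets P, which is why \(b\not \in P\) follows from D only by deduction, whereas \(D_{ min }\) states it directly.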

In summary, these examples demonstrate that the diagram that best supports task performance need not be that which is minimally cluttered. There is likely to be a trade-off between clutter and directly representing statements of interest, using either absence or presence information. There is clearly an interplay between diagram clutter, the use of syntax to represent presence versus absence, and task performance. It is an interesting avenue of future work to explore, empirically, the relationship between diagram clutter and the directness of information representation with respect to task performance.

8 Discussion and Conclusion

In this paper we have explored the potential cognitive benefits of directly representing the absence of individuals in Euler diagram logics. Through identifying sound inference rules, and conditions under which diagrams are inconsistent, we have been able to algorithmically produce minimally cluttered Venn-ie diagrams. As a consequence, it is possible to represent information about sets and their elements in a minimally cluttered way. The inspiration for this research was derived from related work on Euler diagrams which established that increasing levels of clutter diminished task performance. Our discussion above highlights that the case for reducing clutter in Venn-ie diagrams, as a way of improving task performance, is less clear cut. Our results lay an essential foundation for empirically evaluating the impact of clutter from this perspective.

As well as empirical research, future work also includes considering clutter and absence in non-classical logics. In our interpretation of Venn-ie, \(\overline{a}\) is syntactic sugar of which we have made use for its practical ability to reduce clutter. There are two other (non-classical) interpretations of Venn-ie, explored in Choudhury and Chakraborty’s work [3]:

  1. The absence of a in P does not necessarily imply a is in the complement of P, and

  2. The universe is open, so the complement of P does not exist.

In our opinion, the two alternative interpretations are interesting from the point of view of the philosophy and logic of diagrams, and we plan to make them the subject of future work. In the first interpretation, we can represent recursively enumerable sets, which have many important applications in computer science and elsewhere. In the second interpretation, since \(P'\) does not exist, it is also the case that \(\overline{a} \in P\) does not imply \(a \in P'\). The implications of diagrammatic reasoning with an open universe are an interesting and open topic. Lastly, the use of absence could be incorporated into other Euler-diagram-based logics, such as spider diagrams [9], Euler/Venn diagrams [19], constraint diagrams [7] and concept diagrams [10].