# Multi-part balanced incomplete-block designs

## Abstract

We consider designs for cancer trials which allow each medical centre to treat only a limited number of cancer types with only a limited number of drugs. We specify desirable properties of these designs, and prove some consequences. Then we give several different constructions. Finally we generalize this to three or more factors, such as biomarkers.

## Keywords

Balanced incomplete-block design · Basket trial · Multi-part design · Orthogonal array · Orthogonal multi-array

## Mathematics Subject Classification

05B05 · 05B15 · 05B30 · 62K10 · 62K15

## 1 First design problem

### 1.1 The problem

This problem was posed by Valerii Fedorov at the workshop on *Design and Analysis of Experiments in Healthcare* held at the Isaac Newton Institute for Mathematical Sciences at Cambridge, UK in July 2015. The context is *basket trials*, where several different drugs are tested on several different diseases in a single protocol which involves many medical centres: see Derhaschung et al. (2016) and Woodcock and LaVange (2017). The combinatorial properties listed below have been proposed by Fedorov and Leonov (2019) as potentially giving optimal designs, which may give a benchmark for designs which are achievable in practice.

A trial is being designed to compare several drugs for their effects on several different types of cancer. In order to keep the protocol simple for each medical centre involved, it is proposed to limit each medical centre to only a few of the cancer types and only a few of the drugs. For each cancer type at that medical centre, each patient will be allocated to one of the drugs at that medical centre, the aim being that the numbers of such patients on each drug are nearly equal.

Let \(v_1\) be the number of cancer types, \(v_2\) the number of drugs and *b* the number of medical centres. The properties listed below are desirable. The first two are to keep the protocol simple. Fedorov and Leonov (2019) propose several statistical models for the response of each patient. The simplest is additive in the effects of medical centre, cancer type and drug. It is not known *a priori* how many suitable patients will enrol at each medical centre. If there are the same number at each medical centre then conditions (c)–(e) give a design that is optimal in the sense of minimizing the variances of the estimators of parameters of interest: see Sect. 1.4.

- (a) all medical centres involve the same number, say \(k_1\), of cancer types, where \(k_1<v_1\);
- (b) all medical centres use the same number, say \(k_2\), of drugs, where \(k_2<v_2\);
- (c) each pair of distinct cancer types are involved together at the same non-zero number, say \(\lambda _{11}\), of medical centres;
- (d) each pair of distinct drugs are used together at the same non-zero number, say \(\lambda _{22}\), of medical centres;
- (e) each drug is used on each type of cancer at the same number, say \(\lambda _{12}\), of medical centres.

Conditions (a) and (c) specify that the design for cancer types is a balanced incomplete-block design, also known as a 2-design, or, more specifically, a 2-\((v_1,k_1,\lambda _{11})\) design. Likewise, conditions (b) and (d) specify that the design for drugs is a 2-design. We call these the C-design and the D-design respectively.

We shall call a design satisfying conditions (a)–(e) a *2-part 2-design* or *2-part balanced incomplete-block design*. These are not the same as the bipartite designs defined by Hoffman and Liatti (1995).
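Conditions (a)–(e) can be checked mechanically from a list of blocks. The following Python sketch is an illustration only: the function name `two_part_parameters`, the input format (a list of pairs of sets) and the six-block example with \(v_1=3\), \(v_2=4\) and \(k_1=k_2=2\) are assumptions made here, not notation taken from the papers cited above.

```python
from itertools import combinations

def two_part_parameters(blocks):
    """Parameters of a 2-part 2-design given as (cancer types, drugs) pairs.

    Returns None if any of conditions (a)-(e) fails.
    """
    cancers = sorted(set().union(*(c for c, _ in blocks)))
    drugs = sorted(set().union(*(d for _, d in blocks)))
    k1 = {len(c) for c, _ in blocks}                       # condition (a)
    k2 = {len(d) for _, d in blocks}                       # condition (b)
    lam11 = {sum(1 for c, _ in blocks if x in c and y in c)
             for x, y in combinations(cancers, 2)}         # condition (c)
    lam22 = {sum(1 for _, d in blocks if x in d and y in d)
             for x, y in combinations(drugs, 2)}           # condition (d)
    lam12 = {sum(1 for c, d in blocks if x in c and y in d)
             for x in cancers for y in drugs}              # condition (e)
    value_sets = (k1, k2, lam11, lam22, lam12)
    if any(len(s) != 1 for s in value_sets):
        return None                                        # some count is not constant
    k1, k2, lam11, lam22, lam12 = [s.pop() for s in value_sets]
    if k1 >= len(cancers) or k2 >= len(drugs) or lam11 == 0 or lam22 == 0:
        return None                                        # incomplete blocks, non-zero lambdas
    return {"b": len(blocks), "v1": len(cancers), "v2": len(drugs),
            "k1": k1, "k2": k2,
            "lam11": lam11, "lam22": lam22, "lam12": lam12}

# Hypothetical example: each line is one medical centre.
blocks = [
    ({"C1", "C2"}, {"D1", "D2"}), ({"C1", "C2"}, {"D3", "D4"}),
    ({"C1", "C3"}, {"D1", "D3"}), ({"C1", "C3"}, {"D2", "D4"}),
    ({"C2", "C3"}, {"D1", "D4"}), ({"C2", "C3"}, {"D2", "D3"}),
]
params = two_part_parameters(blocks)
print(params)
```

For this example the function reports \(b=6\), \(\lambda _{11}=2\), \(\lambda _{22}=1\) and \(\lambda _{12}=2\); removing the last block makes it return `None`, because the pair C2, C3 then occurs together less often than the other pairs.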

### 1.2 Previous work

In Sect. 2 we concentrate on designs with only two different factors (cancer types and drugs), before generalizing to three or more factors in Sect. 3. This is partly to help the reader to become familiar with the ideas, and partly because this case seems likely to be of practical importance in the clinical context described.

The more general case has already been considered by Sitter (1993), Mukerjee (1998) and Hedayat et al. (1999, Sect. 10.8). Because conditions (a)–(d) specify balanced incomplete-block designs and condition (e) is reminiscent of the definition of orthogonal multi-array given by Brickell (1984), Sitter (1993) called these designs *balanced orthogonal multi-arrays*. Brickell’s original definition was essentially a generalization of orthogonal arrays of strength two and minimal size, so it included the conditions that *b* is a square and \(\lambda _{12}=1\). Sitter (1993) acknowledged that he was removing those conditions.

However, the original definition of orthogonal multi-array continues to be in use in many areas. Orthogonal multi-arrays give an alternative definition of semi-Latin squares: see Bailey (1992) and Soicher (1999, 2013). Dually, they are used in factorial designs: see Bailey (2011). Phillips and Wallis (1996) used them in the study of tournaments. They are used in cryptography: see, for example, Anthony et al. (1990) and Martin et al. (1992). Recently, Li et al. (2015) have generalized them to strength *t*, so that *b* is a *t*-th power of an integer. This generalization seems to be within the spirit of the original definition, whereas Sitter’s does not.

Thus we think that “2-part 2-design” (or, more generally, a multi-part 2-design) is a more suitable name.

Sitter (1993) also allowed the block size within each factor to vary. Mukerjee (1998) called the balanced orthogonal multi-arrays *proper* when this is not allowed. He also restricted attention to the case where \(k_i< v_i\), unlike Sitter (1993). Both allowed \(\lambda _{ii}\) to be zero, which permits confounding: in Table 1 of Mukerjee (1998) one factor has its levels confounded with blocks.

Mukerjee (1998) gave two general constructions for designs of this type. We shall comment on the relationship of these to our constructions at the relevant places.

### 1.3 Representing the designs

(*i*, *j*) for which the combination of cancer type *i* and drug *j* occurs in that block. Figure 2 shows the design in Fig. 1 in this format. This dual representation does not extend easily to the generalization of the problem in Sects. 3–4.

The most concise way to represent the design is simply to list, for each block, the cancer types and drugs allocated to it. This list has \(b(k_1+k_2)\) items. This representation was used by Sitter (1993) and Mukerjee (1998). Figure 3 gives the concise representation of the design in Fig. 1.

We shall use the concise representation for the remainder of this paper. However, it can be misinterpreted when removed from the practical context. For example, the reader might think that Block 1 in Fig. 3 contains five treatments, those in the union of the sets \(\{\text{ C1 },\text{ C2 },\text{ C3 }\}\) and \(\{\text{ D1 },\text{ D5 }\}\), rather than the six treatment combinations in the cartesian product of these sets. This misinterpretation gives a block design for \(v_1+v_2\) treatments in *b* blocks of size \(k_1+k_2\), which we call the zipped form of the original design.

Figure 1 avoids this problem, but at the cost of repeating the information about the drugs in each block. This format contains \(bk_1k_2\) items, as many as the full representation, but it seems easier to read.

In the literature about block designs, the incidence matrix has (*i*, *j*)-entry equal to the number of times that treatment *i* occurs in block *j*: see, for example, John and Williams (1995); Caliński and Kageyama (2000); Bailey and Cameron (2009). Let \(N_1\) be the \(v_1 \times b\) incidence matrix of cancer types in blocks in the zipped form of the design. The (*i*, *j*)-entry is 1 if cancer type *i* occurs in block *j*; otherwise, it is 0. Let \(N_2\) be the analogous \(v_2 \times b\) incidence matrix for drugs in blocks. Then the incidence matrices for the full design are \(k_2N_1\) and \(k_1N_2\) respectively, not allowing for the unknown number of times that each combination will eventually be used in any block.
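In matrix terms, conditions (c)–(e) of Sect. 1.1 say that \(N_1N_1^\top \) and \(N_2N_2^\top \) have constant off-diagonal entries \(\lambda _{11}\) and \(\lambda _{22}\), and that every entry of \(N_1N_2^\top \) equals \(\lambda _{12}\). The sketch below illustrates this with plain Python lists; the helper names `incidence` and `gram` and the small six-block design are assumptions chosen for illustration.

```python
def incidence(points, sets):
    """0/1 incidence matrix: one row per point, one column per block."""
    return [[1 if p in s else 0 for s in sets] for p in points]

def gram(A, B):
    """Matrix product A B^T for matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in B]
            for row_a in A]

# Hypothetical design with v1 = 3, v2 = 4, k1 = k2 = 2 and b = 6.
blocks = [
    ({"C1", "C2"}, {"D1", "D2"}), ({"C1", "C2"}, {"D3", "D4"}),
    ({"C1", "C3"}, {"D1", "D3"}), ({"C1", "C3"}, {"D2", "D4"}),
    ({"C2", "C3"}, {"D1", "D4"}), ({"C2", "C3"}, {"D2", "D3"}),
]
cancers = ["C1", "C2", "C3"]
drugs = ["D1", "D2", "D3", "D4"]

N1 = incidence(cancers, [c for c, _ in blocks])   # v1 x b
N2 = incidence(drugs, [d for _, d in blocks])     # v2 x b

print(gram(N1, N1))   # diagonal r1, off-diagonal lambda_11
print(gram(N2, N2))   # diagonal r2, off-diagonal lambda_22
print(gram(N1, N2))   # every entry lambda_12
```

In this example the diagonal entries are the replications \(r_1=4\) and \(r_2=3\), the off-diagonal entries are 2 and 1, and \(N_1N_2^\top \) has every entry equal to 2.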

### 1.4 Comparison with other designs

At first sight, the design in Fig. 1 appears to be a block design for two treatment factors *C* and *D*. However, there are important differences between this and previous designs. In our application, the medical centre represented by Block 1 will accept into the trial only patients with cancer types 1, 2 or 3. It has no control in advance over how many such patients will present themselves. For each of these three cancer types, it will randomize approximately equal numbers of patients to drugs 1 and 5. In the original proposal, the listed drugs include placebo. In later variants, placebo is not listed, and patients should be randomized approximately equally to drugs 1 and 5 and placebo, or approximately one quarter each to placebo, drug 1, drug 5 and their combination.

Sitter (1993) introduced his designs for use in sampling. In designed experiments, Mukerjee (1998) envisaged a completely different sort of application from the one we describe here. In that, each block represents a single observational unit. For each factor, subsets of the levels are applied, rather than single levels. For example, a group of \(k_1\) people might be needed, all playing similar roles, or a hybrid variety of wheat might be bred from \(k_2\) pure lines. See also Bailey (1992). In this context, it is not problematic to have \(\lambda _{ii}=0\) (so that \(k_i=1\)) for either \(i=1\) or \(i=2\).

*k*, from Yates (1933), Fisher (1935, 1942) and Bose (1947) onwards, combinations of factor levels do not occur more than once in any block: thus \(k_1=k_2=k\). Moreover, the subsets of combinations allocated to blocks are chosen depending on various assumptions about main effects and interactions. For example, if \(v_1=v_2=3\) and there are six blocks of three plots each then the design in Fig. 4 permits estimation of both main effects with full efficiency and all interaction contrasts with efficiency factor 1/2.

The dual form of this design is shown in Fig. 5. The positions of the block names show clearly how the block design was constructed from a pair of mutually orthogonal Latin squares. However, unlike in Fig. 2, no block name occurs more than once in any row or column. Consequently, the occurrences of each block name do not have the rectangular layout that they do in Fig. 2.

*k* is the block size. For example, Preece (1966b) gave the design in Fig. 6. The dual form is in Fig. 7.

Many authors required the block design for each treatment factor separately to be balanced. This is the analogue of conditions (a)–(d) when \(k_1=k_2\). From Agrawal (1966) and Preece (1966a) onwards, another condition was often imposed, eventually called *adjusted orthogonality* by Eccleston and Russell (1977): the product \({\tilde{N}}_1{\tilde{N}}_2^\top \) should have all its entries equal, where \({\tilde{N}}_1\) and \({\tilde{N}}_2\) are the \(v_1\times b\) and \(v_2\times b\) incidence matrices for the first and second treatment factors, respectively, in blocks. Although this is a consequence of condition (e), it is not equivalent to it. The duals of designs satisfying these conditions were called *triple arrays* by McSorley et al. (2005). The design in Fig. 7 is a triple array.

The statistical relevance of adjusted orthogonality is discussed in Bailey (2017, Sects. 7–8). If all medical centres recruit the same number of patients then, under condition (b) and the first version of the proposal, condition (d) gives a design which is optimal for the estimation of drug effects in the model which excludes the effects of cancer types. These estimates are obtained by adjusting for block effects. Adjusted orthogonality implies that, under the additive model for all three effects, once the responses have been adjusted for block effects then drugs are orthogonal to cancer types and so no further adjustment is needed. Hence conditions (d) and (e) give a design optimal for the estimation of drug effects under condition (b). Likewise, conditions (c) and (e) give a design optimal for the estimation of cancer effects under condition (a).

In spite of the similar conditions that they satisfy, triple arrays are not special cases of 2-part 2-designs, nor vice versa. In a triple array, no block name occurs more than once in any row or column. In the dual form of a 2-part 2-design, any block name that occurs in a given row must occur \(k_2\) times in that row. A consequence of the “non-zero” part of condition (d) is that \(k_2>1\).

Apart from the designs given by Preece et al. (2005), infinite families of triple arrays have proved frustratingly hard to find: see Bailey (2017, Sect. 13). By contrast, in Sects. 2 and 4 of this paper we give many simple constructions of 2-part 2-designs and their generalizations.

### 1.5 Conditions on parameters

An ordinary block design is said to be \(\alpha \)-resolved if its set of blocks can be partitioned into classes in such a way that each treatment occurs \(\alpha \) times in each class. This terminology does not extend easily to 2-part 2-designs, because cancer types may occur in different numbers of blocks from drugs. We propose calling a 2-part block design *c*-*partitionable* if the set of blocks can be grouped into *c* classes of *b*/*c* blocks each, in such a way that every cancer type occurs the same number of times in each class and every drug occurs the same number of times in each class. It is convenient to extend this terminology to ordinary block designs: such a design with replication *r* is \(\alpha \)-resolved if and only if it is *c*-partitionable, where \(\alpha c =r\).

### Theorem 1

*c*-partitionable then

### Proof

The first two statements are the usual conditions for the 2-designs on cancer types and drugs respectively, while Eq. (3) equates two different ways of counting the number of choices of a cancer type, a drug, and a block containing both.
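Written out, with \(r_1\) and \(r_2\) denoting the numbers of blocks containing a given cancer type and a given drug respectively, the counting conditions just described and the two inequalities cited below as (4) and (5) take the following form. This display is a reconstruction, inferred from the counting argument above and from the way the inequalities are used in Remark 2 and in Constructions 3 and 4, so its exact wording should be treated as an assumption.

```latex
\begin{align}
v_1 r_1 &= b k_1 \quad\text{and}\quad \lambda_{11}(v_1-1) = r_1(k_1-1), \tag{1}\\
v_2 r_2 &= b k_2 \quad\text{and}\quad \lambda_{22}(v_2-1) = r_2(k_2-1), \tag{2}\\
v_1 v_2 \lambda_{12} &= b k_1 k_2, \tag{3}\\
b &\ge v_1 + v_2 - 1, \tag{4}\\
b &\ge v_1 + v_2 + c - 2 \quad\text{for a $c$-partitionable design.} \tag{5}
\end{align}
```

With \(c=1\), (5) reduces to (4).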

*I* and *J* are identity and all-1 matrices of the appropriate sizes.

*c* respectively whose entries sum to 0. Then

*c* respectively. The action of \(NN^\top \) on this space is obtained by replacing the block matrices by their row sums: using the results in (1)–(3), this simplifies to

The first part of the theorem shows that every 2-part 2-design is 1-partitionable. Thus inequality (4) is a special case of inequality (5). \(\square \)

### Remark 1

Mukerjee (1998) remarked on the integrality conditions (1)–(3) without stating them explicitly, and proved inequality (4).
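The integrality conditions can be explored numerically. The sketch below assumes the standard counting identities \(v_ir_i=bk_i\), \(\lambda _{ii}(v_i-1)=r_i(k_i-1)\) and \(v_1v_2\lambda _{12}=bk_1k_2\), together with the bound \(b\ge v_1+v_2+c-2\) for a *c*-partitionable design (taking \(c=1\) in general). The function name and its argument order are hypothetical. Passing this check is necessary, not sufficient, for a design to exist.

```python
from fractions import Fraction

def derived_parameters(b, v1, k1, v2, k2, c=1):
    """Replications and concurrences forced by (b, v1, k1, v2, k2), or None.

    Returns None if a forced value is not a positive integer or if
    b < v1 + v2 + c - 2.
    """
    r1 = Fraction(b * k1, v1)                  # from v1 * r1 = b * k1
    r2 = Fraction(b * k2, v2)                  # from v2 * r2 = b * k2
    values = {
        "r1": r1,
        "r2": r2,
        "lam11": r1 * (k1 - 1) / (v1 - 1),     # lam11 * (v1 - 1) = r1 * (k1 - 1)
        "lam22": r2 * (k2 - 1) / (v2 - 1),     # lam22 * (v2 - 1) = r2 * (k2 - 1)
        "lam12": Fraction(b * k1 * k2, v1 * v2),
    }
    if any(x.denominator != 1 or x <= 0 for x in values.values()):
        return None
    if b < v1 + v2 + c - 2:
        return None
    return {name: int(x) for name, x in values.items()}

fig1_like = derived_parameters(10, 6, 3, 5, 2)   # six cancer types, five drugs, ten blocks
too_few = derived_parameters(9, 6, 3, 5, 2)      # nine blocks: r1 is not an integer
print(fig1_like, too_few)
```

For \(v_1=6\), \(k_1=3\), \(v_2=5\), \(k_2=2\) and \(b=10\) this gives \(r_1=5\), \(r_2=4\), \(\lambda _{11}=2\), \(\lambda _{22}=1\) and \(\lambda _{12}=2\), and the bound \(b\ge v_1+v_2-1\) holds with equality.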

### Remark 2

Inequality (4) can be regarded as a generalization of both Fisher’s and Bose’s inequalities: see Cameron and van Lint (1991, Chap. 1) and Bailey (2008, Chap. 11). For Fisher’s inequality, take the C-design to be any 2-design with \(v=v_1\), and take a single drug which occurs in all blocks; we have \(\lambda _{12}=r_1\) and \(\lambda _{22}=0\): although our conditions that \(\lambda _{22}>0\) and \(k_2<v_2\) fail for the D-design, the proof still works, because the only vector \(\mathbf {w}_2\) in Eq. (6) is the zero vector: thus the proof gives \(b\ge v+1-1\). For Bose’s inequality, take the C-design to be any resolvable 2-design with \(v=v_1\) and replication \(r=r_1\), and the drugs to be labelled by the resolution classes of the design, with a drug in every block in the corresponding resolution class, so that \(v_2=r\). We have \(\lambda _{12}=1\) and \(\lambda _{22}=0\). Again part of condition (d) fails, but the proof works, giving \(b\ge v+r-1\). Inequality (5) seems to be the true analogue of Bose’s inequality for 2-part 2-designs.

## 2 Constructions of 2-part 2-designs

In this section, we give several constructions. In order to identify when two different constructions give designs which are essentially the same, we say that two 2-part 2-designs are *isomorphic* to each other if one can be obtained from the other by relabelling the blocks, cancer types and drugs. Weak isomorphism generalizes this by also allowing the roles of cancer types and drugs to be interchanged.

Given two or more non-isomorphic designs for the same parameters, there may be practical reasons for preferring one over the rest.

*C-swap* creates a new 2-part 2-design. This simply involves replacing the set of cancer types in each block with the complementary set. This changes the parameters \(k_1\), \(\lambda _{11}\) and \(\lambda _{12}\) to \(v_1-k_1\), \(b-2r_1+\lambda _{11}\) and \(r_2-\lambda _{12}\), leaving *b*, \(v_1\), \(v_2\), \(k_2\) and \(\lambda _{22}\) unchanged. The new design fulfills all the conditions so long as \(v_1-k_1\ge 2\). The combination of a C-swap and the analogous D-swap has the effect of replacing each block by its complement (in the zipped form).
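The parameter changes under a C-swap can be checked on a small case. In the Python sketch below the test design pairs every 2-subset of five cancer types with every 2-subset of three drugs, so that \(b=30\), \(r_1=12\), \(r_2=20\), \(\lambda _{11}=3\) and \(\lambda _{12}=8\). This test design and the function names are assumptions made for illustration.

```python
from itertools import combinations, product

def c_swap(blocks, cancers):
    """Replace the cancer types in each block by the complementary set."""
    everything = frozenset(cancers)
    return [(everything - c, d) for c, d in blocks]

def counts(blocks, cancers, drugs):
    """Sets of values taken by k1, lambda_11 and lambda_12 across the design."""
    k1 = {len(c) for c, _ in blocks}
    lam11 = {sum(1 for c, _ in blocks if x in c and y in c)
             for x, y in combinations(cancers, 2)}
    lam12 = {sum(1 for c, d in blocks if x in c and y in d)
             for x in cancers for y in drugs}
    return k1, lam11, lam12

cancers, drugs = range(5), range(3)
blocks = [(frozenset(c), frozenset(d))
          for c, d in product(combinations(cancers, 2), combinations(drugs, 2))]

before = counts(blocks, cancers, drugs)
after = counts(c_swap(blocks, cancers), cancers, drugs)
print(before, after)
```

The swap changes \(k_1\) from 2 to \(v_1-k_1=3\), \(\lambda _{11}\) from 3 to \(b-2r_1+\lambda _{11}=9\) and \(\lambda _{12}\) from 8 to \(r_2-\lambda _{12}=12\).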

### Construction 1

(Cartesian products) One obvious method of construction is the cartesian product. This starts with two balanced incomplete-block designs, one for \(v_1\) treatments in \(b_1\) blocks of size \(k_1\), the other for \(v_2\) treatments in \(b_2\) blocks of size \(k_2\). Form all \(b_1b_2\) combinations of a block of each sort. For each combination, form the cartesian product of their subsets of treatments.

This will usually result in rather large values of *b*. For example, when \(v_1=6\), \(k_1=3\), \(v_2=5\) and \(k_2=2\) then the smallest possible values of \(b_1\) and \(b_2\) are both 10, so this construction gives a design with 100 blocks, unlike the design with 10 blocks in Fig. 1.
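As a minimal illustration of Construction 1, the following Python sketch forms the product of two copies of the 2-(3, 2, 1) design, which gives nine blocks. The function names are hypothetical, and the helper `concurrence_values` simply counts how often pairs of levels occur together.

```python
from itertools import combinations, product

def cartesian_product_design(design1, design2):
    """All b1*b2 pairs (block of design1, block of design2), in concise form."""
    return [(frozenset(c), frozenset(d)) for c, d in product(design1, design2)]

def concurrence_values(blocks):
    """Sets of values taken by lambda_11, lambda_22 and lambda_12."""
    cancers = sorted(set().union(*(c for c, _ in blocks)))
    drugs = sorted(set().union(*(d for _, d in blocks)))
    lam11 = {sum(1 for c, _ in blocks if x in c and y in c)
             for x, y in combinations(cancers, 2)}
    lam22 = {sum(1 for _, d in blocks if x in d and y in d)
             for x, y in combinations(drugs, 2)}
    lam12 = {sum(1 for c, d in blocks if x in c and y in d)
             for x in cancers for y in drugs}
    return lam11, lam22, lam12

delta1 = [{"C1", "C2"}, {"C1", "C3"}, {"C2", "C3"}]   # 2-(3,2,1) design on cancer types
delta2 = [{"D1", "D2"}, {"D1", "D3"}, {"D2", "D3"}]   # 2-(3,2,1) design on drugs

blocks = cartesian_product_design(delta1, delta2)
values = concurrence_values(blocks)
print(len(blocks), values)
```

Here \(\lambda _{11}=\lambda _{22}=3\) and \(\lambda _{12}=4\): each concurrence within a factor is multiplied by the number of blocks of the other design, and \(\lambda _{12}\) is the product of the two replications.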

Table 1. Parameter sets for the designs with the least number of blocks that can be made by Construction 1: \(v_1\) is the number of cancer types, \(v_2\) is the number of drugs, and *b* is the number of blocks, each of which has \(k_1\) cancer types and \(k_2\) drugs

| *b* | \(v_1\) | \(v_2\) | \(k_1\) | \(k_2\) |
|---|---|---|---|---|
| 9 | 3 | 3 | 2 | 2 |
| 12 | 4 | 3 | 3 | 2 |
| 15 | 5 | 3 | 4 | 2 |
| 16 | 4 | 4 | 3 | 3 |
| 18 | 4 | 3 | 2 | 2 |
| 18 | 6 | 3 | 5 | 2 |
| 20 | 5 | 4 | 4 | 3 |
| 21 | 7 | 3 | 3 | 2 |
| 21 | 7 | 3 | 6 | 2 |
| 24 | 4 | 4 | 3 | 2 |
| 24 | 6 | 4 | 5 | 3 |
| 24 | 8 | 3 | 7 | 2 |
| 25 | 5 | 5 | 4 | 4 |
| 27 | 9 | 3 | 8 | 2 |
| 28 | 7 | 4 | 3 | 3 |
| 28 | 7 | 4 | 6 | 3 |
| 30 | 5 | 3 | 2 | 2 |
| 30 | 5 | 4 | 4 | 2 |
| 30 | 6 | 3 | 3 | 2 |
| 30 | 6 | 5 | 5 | 4 |
| 32 | 8 | 4 | 7 | 3 |
| 33 | 11 | 3 | 5 | 2 |
| 33 | 11 | 3 | 10 | 2 |
| 35 | 7 | 5 | 3 | 4 |
| 35 | 7 | 5 | 6 | 4 |

### Construction 2

(Subcartesian products) If \(k_2\) divides \(v_2\) then there may exist a resolved 2-design \(\varDelta _2\) for \(v_2\) drugs in \(b_2\) blocks of size \(k_2\) with *r* resolution classes. Suppose that \(\varDelta _1\) is a 2-design for \(v_1\) cancer types in \(b_1\) blocks of size \(k_1\), where \(b_1\) is a multiple of *r*. Now we can achieve a 2-part 2-design without taking the full product. Partition the blocks of \(\varDelta _1\) into *r* classes of size \(b_1/r\) in any way at all, and match these classes to the resolution classes of \(\varDelta _2\) in any way. For each matched pair, construct the cartesian product design. Putting these products together gives a design of the required type with \(b_1b_2/r\) blocks, considerably fewer than the \(b_1b_2\) blocks in the entire product of \(\varDelta _1\) and \(\varDelta _2\).

More generally, if the design \(\varDelta _2\) is *c*-partitionable and *c* divides \(b_1\) then replace the resolution classes in this construction by the *c* classes of blocks. This gives a 2-part 2-design with \(b_1b_2/c\) blocks. Putting \(c=1\) gives Construction 1 as a special case of this.
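The matching step of Construction 2 can be sketched in a few lines of Python. In this illustration \(\varDelta _1\) is the 2-(3, 2, 1) design, split into three classes of one block each, and \(\varDelta _2\) is the resolved 2-(4, 2, 1) design with three resolution classes. The function names and the choice of matching are assumptions of the sketch.

```python
from itertools import combinations, product

def subcartesian_product(classes1, classes2):
    """Match the i-th class of design 1 to the i-th class of design 2 and
    take the cartesian product of blocks within each matched pair."""
    assert len(classes1) == len(classes2)
    return [(frozenset(c), frozenset(d))
            for class1, class2 in zip(classes1, classes2)
            for c, d in product(class1, class2)]

def concurrence_values(blocks):
    """Sets of values taken by lambda_11, lambda_22 and lambda_12."""
    cancers = sorted(set().union(*(c for c, _ in blocks)))
    drugs = sorted(set().union(*(d for _, d in blocks)))
    lam11 = {sum(1 for c, _ in blocks if x in c and y in c)
             for x, y in combinations(cancers, 2)}
    lam22 = {sum(1 for _, d in blocks if x in d and y in d)
             for x, y in combinations(drugs, 2)}
    lam12 = {sum(1 for c, d in blocks if x in c and y in d)
             for x in cancers for y in drugs}
    return lam11, lam22, lam12

# Delta_1: b1 = r = 3 blocks, so each class is a single block.
classes1 = [[{"C1", "C2"}], [{"C1", "C3"}], [{"C2", "C3"}]]
# Delta_2: the three resolution classes of the 2-(4,2,1) design.
classes2 = [[{"D1", "D2"}, {"D3", "D4"}],
            [{"D1", "D3"}, {"D2", "D4"}],
            [{"D1", "D4"}, {"D2", "D3"}]]

blocks = subcartesian_product(classes1, classes2)
values = concurrence_values(blocks)
print(len(blocks), values)
```

This produces \(b_1b_2/r=3\times 6/3=6\) blocks with \(\lambda _{11}=2\), \(\lambda _{22}=1\) and \(\lambda _{12}=2\), compared with 18 blocks for the full product.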

Table 2 shows some parameter sets for designs that can be made by Construction 2, possibly after an interchange or a swap, with \(k_i\le 10\) for \(i=1\) and \(i=2\). See the database in DesignTheory.org (2012) for the resolved designs used.

There are two special cases. When \(b_1=r\) then we simply match the blocks of \(\varDelta _1\) to the resolution classes of \(\varDelta _2\). When \(v_1=3\), \(k_1=2\), \(v_2=4\), \(k_2=2\) and \(r=3\), this gives the design in Fig. 10. When \(v_1=v_2=6\), \(k_1=k_2=3\) and \(r=10\), this gives the design in Fig. 11. When \(v_1=7\), \(k_1=3\), \(v_2=15\), \(k_2=3\) and \(r=7\), this gives a 2-part 2-design with \(b=35\), \(r_1=15\), \(r_2=7\), \(\lambda _{11}=5\), \(\lambda _{22}=1\) and \(\lambda _{12}=3\).

*r* then we may match the resolution classes of the two designs. For example, when \(v_1=v_2=4\) and \(k_1=k_2=2\) then we may take \(r=3\) and \(b_1=b_2=6\) to get the design in Fig. 8. This is not even weakly isomorphic to the design in Fig. 9, where the pairs of blocks from \(\varDelta _1\) do not form resolution classes. When \(v_1/k_1 = v_2/k_2=2\) and \(b_1=b_2\), Construction 3 also gives designs with these parameters.

Table 2. Parameter sets for the designs with the least number of blocks with \(k_1\le 10\) and \(k_2\le 10\) that can be made by Constructions 2 or 3 but not 1: \(v_1\) is the number of cancer types, \(v_2\) is the number of drugs, and *b* is the number of blocks, each of which has \(k_1\) cancer types and \(k_2\) drugs; *r* is a number used in Construction 2

| *b* | \(v_1\) | \(v_2\) | \(k_1\) | \(k_2\) | *r* | |
|---|---|---|---|---|---|---|
| 6 | 4 | 3 | 2 | 2 | 3 | |
| 12 | 4 | 4 | 2 | 2 | 3 | \* |
| 12 | 6 | 4 | 5 | 2 | 3 | |
| 12 | 9 | 4 | 3 | 3 | 4 | |
| 14 | 8 | 7 | 4 | 3 | 7 | |
| 14 | 8 | 7 | 4 | 6 | 7 | |
| 15 | 6 | 5 | 2 | 4 | 5 | |
| 18 | 9 | 4 | 8 | 2 | 3 | |
| 20 | 6 | 5 | 3 | 2 | 10 | |
| 20 | 6 | 6 | 3 | 3 | 10 | \* |
| 20 | 10 | 6 | 9 | 3 | 10 | |
| 20 | 16 | 5 | 4 | 4 | 5 | |
| 22 | 12 | 11 | 6 | 5 | 11 | |
| 22 | 12 | 11 | 6 | 10 | 11 | |
| 24 | 9 | 4 | 3 | 2 | 3 | |
| 24 | 9 | 8 | 3 | 7 | 4 | |
| 28 | 8 | 7 | 2 | 3 | 7 | |
| 28 | 8 | 7 | 2 | 6 | 7 | |
| 28 | 8 | 8 | 4 | 4 | 7 | \* |
| 30 | 6 | 4 | 2 | 2 | 3 | |
| 30 | 6 | 5 | 2 | 2 | 5 | |
| 30 | 6 | 6 | 3 | 2 | 5 | |
| 30 | 10 | 4 | 4 | 2 | 3 | |
| 30 | 10 | 6 | 9 | 2 | 5 | |
| 30 | 15 | 4 | 7 | 2 | 3 | |
| 30 | 16 | 3 | 8 | 2 | 15 | |
| 30 | 16 | 5 | 8 | 4 | 15 | |
| 30 | 16 | 6 | 8 | 2 | 15 | |
| 30 | 16 | 10 | 8 | 4 | 15 | |
| 30 | 16 | 15 | 8 | 7 | 15 | |
| 30 | 25 | 3 | 5 | 2 | 6 | |
| 30 | 25 | 4 | 5 | 2 | 6 | |
| 30 | 25 | 6 | 5 | 5 | 6 | |
| 33 | 12 | 11 | 4 | 5 | 11 | |
| 33 | 12 | 11 | 4 | 10 | 11 | |
| 35 | 15 | 7 | 3 | 3 | 7 | |
| 35 | 15 | 7 | 3 | 6 | 7 | |

At first sight, the two general constructions given by Mukerjee (1998) are special cases of this. His first construction needs both \(\varDelta _1\) and \(\varDelta _2\) to be *c*-partitionable, and matches the classes. This includes the cartesian product when \(c=1\), and when \(c=3\) it gives the design in Fig. 8 but not the one in Fig. 9. His second construction uses a *c*-partitionable design \(\varDelta _2\) only when \(b_1=c\). However, if *c* divides \(b_1\) then we may replace \(\varDelta _2\) by \(b_1/c\) copies of it, giving a \(b_1\)-partitionable design whose classes can be matched to the blocks of \(\varDelta _1\).

Thus Construction 2 is precisely equivalent to the combination of the two in Mukerjee (1998).

Some 2-part 2-designs in which \(v_1=v_2\) and \(k_1=k_2 =v_1/2\) arise from Construction 2. Put \(n=k_1\). Suppose that \(\varDelta _0\) is a 2-design for 2*n* treatments in \(2r_0\) blocks of size *n*. If \(\varDelta _0\) is resolvable then we may put \(\varDelta _1=\varDelta _2=\varDelta _0\) in Construction 2, and match the resolution classes of \(\varDelta _1\) and \(\varDelta _2\) to obtain an \(r_0\)-partitionable 2-part 2-design in \(4r_0\) blocks. Figure 8 gives an example with \(n=2\). If \(\varDelta _0\) is not resolvable, then let \(\varDelta _2\) be the design with \(4r_0\) blocks consisting of \(\varDelta _0\) and its complement. This is resolvable, with \(r=2r_0\). Put \(\varDelta _1=\varDelta _0\) and apply Construction 2, matching the blocks of \(\varDelta _1\) to the replicates of \(\varDelta _2\). Again, this gives a 2-part 2-design in \(4r_0\) blocks. However, this design is not \(r_0\)-partitionable, because its C-design is not resolvable. Figure 11 shows an example with \(n=3\).

There may sometimes be operational reasons for preferring resolvable designs. Moreover, they can be used as ingredients in Construction 9 in Sect. 4 to give designs without too many blocks. The next construction always gives resolvable designs for such parameters.

### Construction 3

(Hadamard matrices) Start with a Hadamard matrix *H* of order 4*n* in which the elements in the first row are all \(+1\). Identify the 2*n* cancer types with the columns in which the second row has entry \(+1\), and identify the 2*n* drugs with the columns in which the second row has entry \(-1\). Each of the remaining rows gives two blocks, one containing all the objects whose columns have entries \(+1\), and one containing all the objects whose columns have entries \(-1\). Thus \(b=8n-4\). Moreover, each pair of blocks contains each cancer type and each drug just once, in the concise representation, so the 2-part 2-design is \((4n-2)\)-partitionable and the lower bound in inequality (5) is achieved.

The asterisked entries in Table 2 show the parameters of the smallest designs that can be constructed by this method.

When \(n=2\) this construction gives the design in Fig. 8. For some values of *n*, different choices of Hadamard matrix, or different designations of which row is second, can give non-isomorphic designs. It may be that there are some values of *n* for which there exists a Hadamard matrix of order 4*n* but no 2-\((2n,n,n-1)\) design. If so, Construction 3 gives a design for these parameters but Construction 2 does not. Such a value of *n* is likely to be too large to affect designs of practical size.
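Construction 3 is easy to carry out by machine. The Python sketch below applies it to the Sylvester-type Hadamard matrix of order 8, so that \(n=2\). Using the Sylvester matrix is a choice made here for convenience; any Hadamard matrix whose first row is all \(+1\) would serve, and the function names are hypothetical.

```python
from itertools import combinations

def sylvester(order):
    """Sylvester-type Hadamard matrix; `order` must be a power of 2."""
    return [[-1 if bin(i & j).count("1") % 2 else 1 for j in range(order)]
            for i in range(order)]

def hadamard_design(H):
    """Construction 3 for a Hadamard matrix H whose first row is all +1."""
    assert all(x == 1 for x in H[0])
    cols = range(len(H))
    cancers = {j for j in cols if H[1][j] == 1}    # second row +1
    drugs = {j for j in cols if H[1][j] == -1}     # second row -1
    blocks = []
    for row in H[2:]:                              # each remaining row gives two blocks
        for sign in (1, -1):
            chosen = {j for j in cols if row[j] == sign}
            blocks.append((frozenset(chosen & cancers), frozenset(chosen & drugs)))
    return cancers, drugs, blocks

def concurrence_values(blocks, cancers, drugs):
    """Sets of values taken by lambda_11, lambda_22 and lambda_12."""
    lam11 = {sum(1 for c, _ in blocks if x in c and y in c)
             for x, y in combinations(sorted(cancers), 2)}
    lam22 = {sum(1 for _, d in blocks if x in d and y in d)
             for x, y in combinations(sorted(drugs), 2)}
    lam12 = {sum(1 for c, d in blocks if x in c and y in d)
             for x in cancers for y in drugs}
    return lam11, lam22, lam12

cancers, drugs, blocks = hadamard_design(sylvester(8))
values = concurrence_values(blocks, cancers, drugs)
print(len(blocks), values)
```

The result has \(b=8n-4=12\) blocks, each with two cancer types and two drugs, and \(\lambda _{11}=\lambda _{22}=2\), \(\lambda _{12}=3\). Consecutive pairs of blocks in the list come from the same row of the matrix, and these pairs form the \(4n-2=6\) classes of the partition.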

### Construction 4

(Symmetric 2-designs) Here is another general method of construction. Consider a symmetric balanced incomplete-block design \(\varDelta \) for *v* treatments in *v* blocks of size *k*. Every pair of distinct treatments concur in \(\lambda \) blocks, where \(\lambda = k(k-1)/(v-1)\), and every pair of distinct blocks have \(\lambda \) treatments in common. Let \(\Gamma \) be one block of \(\varDelta \). Identify the treatments in \(\Gamma \) with *k* drugs \(D_1\), ..., \(D_k\) and the remaining treatments with \(v-k\) cancer types \(C_1\), ..., \(C_{v-k}\). Now consider the design \(\varDelta '\) consisting of all blocks of \(\varDelta \) except \(\Gamma \). Each of these blocks contains \(\lambda \) drugs and \(k-\lambda \) cancer types. In \(\varDelta '\), each pair of drugs concur in \(\lambda -1\) blocks; each pair of cancer types concur in \(\lambda \) blocks; and each drug occurs with each cancer type in \(\lambda \) blocks. Thus \(b = v-1\), \(v_1=v-k\), \(v_2=k\), \(k_1=k-\lambda \), \(k_2=\lambda \), \(\lambda _{11} = \lambda _{12}= \lambda \) and \(\lambda _{22} = \lambda -1\).

We can use Construction 4 whenever there exists a symmetric 2-\((v,k,\lambda )\) design with \(v=v_1+v_2\), \(k=v_2\) and \(\lambda =k_2\), provided that \(k_1+k_2=v_2\). In order to satisfy condition (d), \(\lambda \) must be bigger than one. The lower bound in inequality (4) is always met.

The properties of symmetric 2-designs guarantee that conditions (c) and (d) hold, but they also match up the blocks of the C-design and the D-design, which typically produces fewer blocks than previous construction methods.

The design in Fig. 1 can be obtained by this construction with \(v=11\), \(k=5\) and \(\lambda =2\). Figure 10 gives the design with \(v=7\), \(k=4\) and \(\lambda =2\).
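The following Python sketch carries out Construction 4 for \(v=11\), \(k=5\) and \(\lambda =2\). It builds the symmetric design by developing the quadratic residues modulo 11, a standard difference set. That choice, the point labels 0 to 10 and the function names are assumptions of the sketch, so the output has the parameters of the design in Fig. 1 but need not carry the same labelling.

```python
from itertools import combinations

def develop(base, v):
    """Blocks base + i (mod v); a symmetric 2-design when base is a difference set."""
    return [frozenset((x + i) % v for x in base) for i in range(v)]

def construction4(symmetric_blocks, gamma):
    """Delete the block gamma: its points become drugs, all other points
    become cancer types, and every other block is split accordingly."""
    return [(blk - gamma, blk & gamma) for blk in symmetric_blocks if blk != gamma]

def concurrence_values(blocks):
    """Sets of values taken by lambda_11, lambda_22 and lambda_12."""
    cancers = sorted(set().union(*(c for c, _ in blocks)))
    drugs = sorted(set().union(*(d for _, d in blocks)))
    lam11 = {sum(1 for c, _ in blocks if x in c and y in c)
             for x, y in combinations(cancers, 2)}
    lam22 = {sum(1 for _, d in blocks if x in d and y in d)
             for x, y in combinations(drugs, 2)}
    lam12 = {sum(1 for c, d in blocks if x in c and y in d)
             for x in cancers for y in drugs}
    return lam11, lam22, lam12

symmetric = develop({1, 3, 4, 5, 9}, 11)    # 2-(11,5,2) design from quadratic residues
gamma = symmetric[0]                        # the deleted block: its points are the drugs
blocks = construction4(symmetric, gamma)
values = concurrence_values(blocks)
print(len(blocks), values)
```

The ten remaining blocks each contain \(k-\lambda =3\) cancer types and \(\lambda =2\) drugs, with \(\lambda _{11}=\lambda _{12}=2\) and \(\lambda _{22}=1\).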

Table 3. Parameter sets for which small designs can be made by Construction 4: \(v_1\) is the number of cancer types, \(v_2\) is the number of drugs, and *b* is the number of blocks, each of which has \(k_1\) cancer types and \(k_2\) drugs; *v*, *k* and \(\lambda \) are parameters of the symmetric 2-design used in the construction

| *b* | \(v_1\) | \(v_2\) | \(k_1\) | \(k_2\) | *v* | *k* | \(\lambda \) |
|---|---|---|---|---|---|---|---|
| 6 | 4 | 3 | 2 | 2 | 7 | 4 | 2 |
| 10 | 6 | 5 | 3 | 2 | 11 | 5 | 2 |
| 12 | 9 | 4 | 6 | 3 | 13 | 9 | 6 |
| 14 | 8 | 7 | 4 | 3 | 15 | 7 | 3 |
| 15 | 10 | 6 | 4 | 2 | 16 | 6 | 2 |
| 18 | 10 | 9 | 5 | 4 | 19 | 9 | 4 |
| 22 | 12 | 11 | 6 | 5 | 23 | 11 | 5 |
| 24 | 16 | 9 | 6 | 3 | 25 | 9 | 3 |

### Construction 5

(Augmentation) Given a 2-part 2-design \(\varDelta \) in which \(v_2=2k_2+1\), we may augment it to one for one more drug by increasing \(v_2\) to \(v_2+1\) and \(k_2\) to \(k_2+1\) while merely doubling the number of blocks. Replace each block of \(\varDelta \) by two blocks, both with the same set of cancer types as before. One of these blocks has the previous set of drugs and the extra drug, while the other has all the remaining drugs.

For example, augmenting the design in Fig. 1 gives the design in Fig. 11.
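Augmentation is also straightforward to express in code. The Python sketch below starts from a 2-part 2-design with \(v_1=6\), \(k_1=3\), \(v_2=5\), \(k_2=2\) and \(b=10\), obtained from the quadratic residues modulo 11 by deleting one block of the resulting symmetric design, and adds a sixth drug labelled 11. The starting design, the labels and the function names are assumptions made for illustration.

```python
from itertools import combinations

def augment(blocks, drugs, extra):
    """Construction 5: each block becomes two blocks with the same cancer types."""
    drugs = frozenset(drugs)
    new_blocks = []
    for c, d in blocks:
        new_blocks.append((c, d | {extra}))   # previous drugs plus the extra drug
        new_blocks.append((c, drugs - d))     # all the remaining previous drugs
    return new_blocks

def concurrence_values(blocks):
    """Sets of values taken by lambda_11, lambda_22 and lambda_12."""
    cancers = sorted(set().union(*(c for c, _ in blocks)))
    drugs = sorted(set().union(*(d for _, d in blocks)))
    lam11 = {sum(1 for c, _ in blocks if x in c and y in c)
             for x, y in combinations(cancers, 2)}
    lam22 = {sum(1 for _, d in blocks if x in d and y in d)
             for x, y in combinations(drugs, 2)}
    lam12 = {sum(1 for c, d in blocks if x in c and y in d)
             for x in cancers for y in drugs}
    return lam11, lam22, lam12

# Starting design: develop the quadratic residues mod 11, then delete the first block.
symmetric = [frozenset((x + i) % 11 for x in (1, 3, 4, 5, 9)) for i in range(11)]
gamma = symmetric[0]                                   # its five points are the drugs
base = [(blk - gamma, blk & gamma) for blk in symmetric[1:]]

augmented = augment(base, gamma, extra=11)
values = concurrence_values(augmented)
print(len(augmented), values)
```

The augmented design has 20 blocks, each with three cancer types and three of the six drugs, and \(\lambda _{11}=\lambda _{22}=4\), \(\lambda _{12}=5\).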

Applying the augmentation just to the D-design gives a resolvable 3-design, as shown in the Extension Theorem of Alltop (1972). This can be used directly in Construction 2. However, augmentation is such a straightforward way of obtaining one 2-part 2-design from another that we think it is worth identifying.

### Construction 6

(Group divisible designs) If \(v_1=v_2=v\) and \(k_1=k_2=k\) then the zipped form of a 2-part 2-design is a semi-regular group-divisible incomplete-block design for two so-called groups of *v* treatments in blocks of size 2*k* with \(k>1\): see Bose and Connor (1952). Unzipping any one of these gives a 2-part 2-design.

Table VII of Clatworthy (1973) gives three such designs. Unzipping them gives the product design for the first parameter set in Table 1, the design in Fig. 8 and the design in the first three columns of Fig. 12.

### Construction 7

(Group actions) Here is a construction based on group actions. Suppose that the group *G* acts 2-transitively on two sets *C* and *D* of sizes \(v_1\) and \(v_2\) respectively, and that *G* is also transitive in the induced action on \(C\times D\). Choose a subset of *C* and a subset of *D*, each containing at least two points; their union is a block, and the images of this block under *G* give the remaining blocks. The blocks have to be unzipped to give a 2-part 2-design. This does not give much control over *b*, except that we know it is a divisor of the order of *G*. A strategy for finding good designs by this method is to choose a subgroup *H* of *G* which acts intransitively on each of *C* and *D*, and to use fixed sets of *H* on *C* and *D* in the construction.

The three examples below arise from this construction, but can more easily be derived from the 3-(22, 6, 1) design \(\varXi \) whose automorphism group is the Mathieu group \(M_{22}\). It has 22 points and 77 blocks of size 6, any two blocks meeting in zero or two points; see Cameron and van Lint (1991, Chaps. 1 and 9). For simplicity, we describe the cancer types as red points and the drugs as green points.

Take a block \(B_0\) of the design \(\varXi \); its points are red, and the remaining 16 points are green. For each of the 60 blocks meeting \(B_0\) in two points, we define a block of our new design containing two red and four green points. Now two red points lie in five blocks, one of which is \(B_0\); so they lie in four more blocks. A red point and a green point lie in five blocks, each containing two red points. Two green points lie in three blocks meeting \(B_0\), because each point of \(B_0\) lies in a unique block containing both green points, and each such block contains two points of \(B_0\). So we have an example with \(v_1=6\), \(v_2=16\), \(b=60\), \(k_1=2\), \(k_2=4\), and \((\lambda _{11},\lambda _{12},\lambda _{22})=(4,5,3)\).

The other two examples use the 4-(23, 7, 1) design \(\varTheta \) in which the blocks through the extra point are formed by adjoining that point to the blocks of \(\varXi \): see Cameron and van Lint (1991). The counting arguments that verify their properties are similar to what we have just seen.

For the second design, we take a set *A* of seven points which form a block of \(\varTheta \) not containing the extra point. These will be red, and the remaining 15 points of \(\varXi \) green. Any block of \(\varXi \) meets *A* in one or three points; we take the blocks meeting *A* in three points to be the blocks of the required design. We obtain an example with \(v_1=7\), \(v_2=15\), \(b=35\), \(k_1=k_2=3\), and \((\lambda _{11},\lambda _{12},\lambda _{22})=(5,3,1)\).

This has the same parameters as the fifth design made using Construction 2.

Finally, using the 23-point design \(\varTheta \) but not throwing away the extra point we obtain a design with \(v_1=7\), \(v_2=16\), \(b=140\), \(k_1=3\), \(k_2=4\), and \((\lambda _{11},\lambda _{12},\lambda _{22})=(20,15,7)\). Another design with these parameters is the cartesian product of the projective plane of order 2 and the affine plane of order 4; these designs are not isomorphic.

To build these from the group action construction, the relevant groups are the stabilizers of the sets of six or seven red points in the appropriate Mathieu groups; these are the groups \(2^4:S_6\), \(A_7\), and \(2^4:A_7\) respectively.
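The parameters of the three examples can be sanity-checked by double counting. The following is a minimal sketch, not part of the original constructions: the function name `forced_lambdas` is hypothetical, and the three identities in its docstring are the standard counts of (block, pair) incidences, which any 2-part 2-design must satisfy. Passing this check is necessary for existence but not sufficient.

```python
from fractions import Fraction


def forced_lambdas(v1, v2, b, k1, k2):
    """Concurrences forced by double counting in a 2-part 2-design.

    Counting (block, pair) incidences in two ways gives
        b*k1*(k1-1) = v1*(v1-1)*lambda11,
        b*k1*k2     = v1*v2*lambda12,
        b*k2*(k2-1) = v2*(v2-1)*lambda22.
    Raises ValueError if any of the three values is not an integer.
    """
    lams = (
        Fraction(b * k1 * (k1 - 1), v1 * (v1 - 1)),
        Fraction(b * k1 * k2, v1 * v2),
        Fraction(b * k2 * (k2 - 1), v2 * (v2 - 1)),
    )
    if any(lam.denominator != 1 for lam in lams):
        raise ValueError(f"parameters are not admissible: {lams}")
    return tuple(int(lam) for lam in lams)


# (v1, v2, b, k1, k2) for the three examples described in the text.
examples = [
    (6, 16, 60, 2, 4),   # red points: a block of the 22-point design
    (7, 15, 35, 3, 3),   # red points: a 7-set inside the 22-point design
    (7, 16, 140, 3, 4),  # the 23-point design, keeping the extra point
]
for params in examples:
    print(params, forced_lambdas(*params))
```

The same function confirms that the cartesian product of the projective plane of order 2 (7 blocks of size 3) and the affine plane of order 4 (20 blocks of size 4) has the parameters of the third example, since \(7\times 20=140\).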

## 3 Generalizing the design problem

### 3.1 The extended problem

In March 2016 Valerii Fedorov extended the problem as follows. Can we add a third factor, whose levels are biomarkers in this case, subject to the obvious extra conditions? Here we generalize this to an arbitrary number *m* of factors.

The conditions for an *m*-part 2-design are as follows. The analogue of conditions (a)–(b) is that, for \(1\le i \le m\), factor *i* has \(v_i\) levels and each medical centre involves \(k_i\) of them, where \(k_i<v_i\); the analogue of conditions (c)–(d) is that, for \(1\le i\le m\), each pair of levels of factor *i* are used together at the same non-zero number \(\lambda _{ii}\) of medical centres.

The generalization of condition (e) is less clear. When \(m=3\), a weak generalization is that each biomarker is used on each cancer type at the same number \(\lambda _{13}\) of medical centres and that each biomarker is used with each drug at the same number \(\lambda _{23}\) of medical centres. For now, we use this weak version. Note, however, that this gets us into the territory of factorial design, so we might be confounding all or part of a two-factor interaction with all or part of a main effect. By analogy with orthogonal arrays (Hedayat et al. 1999), we call this weak generalization a *3-part 2-design with strength 2*, whereas a 3-part 2-design with strength 3 would have every triple of (cancer type, drug, biomarker) at the same number \(\lambda _{123}\) of medical centres.

Thus the strength-2 generalization of condition (e) is that, for \(1 \le i <j \le m\), each level of factor *i* occurs with each level of factor *j* at the same number \(\lambda _{ij}\) of medical centres.
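The conditions above are easy to test by brute force on small examples. The sketch below is illustrative only: it assumes a design is stored as a list of blocks, each block being a tuple of *m* sets of levels (one set per factor), and that every level occurs in at least one block. The function names and the nine-block toy design are our own choices, not taken from the paper's figures.

```python
from itertools import combinations, product


def concurrences(blocks):
    """Map (i, j) to the set of values taken by lambda_ij over all pairs."""
    m = len(blocks[0])
    levels = [sorted(set().union(*(blk[i] for blk in blocks))) for i in range(m)]
    lam = {}
    for i in range(m):
        # Pairs of distinct levels of the same factor.
        tally = dict.fromkeys(combinations(levels[i], 2), 0)
        for blk in blocks:
            for pair in combinations(sorted(blk[i]), 2):
                tally[pair] += 1
        lam[(i, i)] = set(tally.values())
    for i, j in combinations(range(m), 2):
        # One level of factor i together with one level of factor j.
        tally = dict.fromkeys(product(levels[i], levels[j]), 0)
        for blk in blocks:
            for pair in product(blk[i], blk[j]):
                tally[pair] += 1
        lam[(i, j)] = set(tally.values())
    return lam


def is_m_part_2_design_strength2(blocks):
    """Check constant k_i < v_i, constant non-zero lambda_ii, constant lambda_ij."""
    m = len(blocks[0])
    for i in range(m):
        sizes = {len(blk[i]) for blk in blocks}
        v_i = len(set().union(*(blk[i] for blk in blocks)))
        if len(sizes) != 1 or min(sizes) >= v_i:
            return False
    for (i, j), values in concurrences(blocks).items():
        if len(values) != 1 or (i == j and 0 in values):
            return False
    return True


# Toy example (an assumption for illustration): three factors with three
# levels each, blocks indexed by the cells of a 3 x 3 Latin square.
pairs = [frozenset(p) for p in combinations(range(3), 2)]
toy = [(pairs[a], pairs[b], pairs[(a + b) % 3]) for a in range(3) for b in range(3)]
print(is_m_part_2_design_strength2(toy), concurrences(toy))
```

Removing any single block from the toy design destroys the balance, so the checker rejects `toy[:-1]`.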

### 3.2 Conditions on parameters in the extended problem

The definition of *c*-partitionable extends to *m*-part 2-designs in the obvious way.

### Theorem 2

In an *m*-part 2-design of strength 2, all of the following are satisfied.

Proofs are similar to those in Sect. 1.5. Part (iv) is precisely Theorem 1 of Mukerjee (1998).

## 4 Constructions of *m*-part 2-designs of strength at least 2

### 4.1 Two main constructions

Here we give the main construction of Mukerjee (1998) in the language of this paper.

### Construction 8

(Orthogonal arrays) Suppose that there is a positive integer *c* such that, for \(i=1\), ..., *m*, \(\varDelta _i\) is a *c*-partitionable 2-design for \(v_i\) treatments in \(b_i\) blocks of size \(k_i\). Moreover, there is an orthogonal array \(\Gamma \) with *m* columns, where column *i* contains \(b_i/c\) symbols for \(1\le i \le m\).

Match the *c* classes of blocks of \(\varDelta _1\), ..., \(\varDelta _m\). For \(j=1\), ..., *c* separately, each row \(\rho \) of \(\Gamma \) gives a block of the new design, as follows. For \(i=1\), ..., *m*, identify the block in class *j* of \(\varDelta _i\) labelled by the symbol in row \(\rho \) and column *i* of \(\Gamma \): then form the cartesian product of these *m* blocks. This gives a *c*-partitionable *m*-part 2-design in *sc* blocks, where *s* is the number of rows of \(\Gamma \). The strength of this new design is equal to the strength of the orthogonal array \(\Gamma \).

In one extreme case, \(\Gamma \) has all possible different rows, so that \(s=\left( \prod _{i=1}^m b_i\right) /c^m\). If, in addition, \(c=1\), then \(s=\prod _{i=1}^m b_i\) and we obtain the full cartesian product.

The design in Fig. 13 can be made in this way with \(c=1\), using an orthogonal array with three columns, each with three symbols.

*j*-th replicates from the three designs, not by the full cartesian product, which would give eight blocks, but by using an orthogonal array of strength 2 with four rows and three columns, each with two symbols. This 3-part 2-design has strength 2 but not strength 3.

Table 1 of Sitter (1993) gives a 7-part 2-design made in this way with \(b=24\) and \(v_i=2k_i=4\) for \(i=1\), ..., 7.
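Construction 8 can be sketched in a few lines of code. The example below is a small instance of our own choosing, not a reproduction of any figure: it assumes each ingredient is the 2-(4, 2, 1) design (all six pairs from four points) split into \(c=3\) classes of two disjoint blocks, and it uses an orthogonal array of strength 2 with four rows and three two-symbol columns. The helper names are hypothetical.

```python
from itertools import combinations, product


def oa_construction(classes, oa_rows):
    """classes[i][j] lists the blocks in class j of the i-th 2-design;
    each row of oa_rows picks one block index per design."""
    m, c = len(classes), len(classes[0])
    return [
        tuple(frozenset(classes[i][j][row[i]]) for i in range(m))
        for j in range(c)
        for row in oa_rows
    ]


def concurrence_values(blocks, i, j):
    """Set of concurrence counts: pairs of levels of factor i if i == j,
    otherwise (level of factor i, level of factor j) combinations."""
    def levels(f):
        return sorted(set().union(*(blk[f] for blk in blocks)))

    if i == j:
        tally = dict.fromkeys(combinations(levels(i), 2), 0)
        for blk in blocks:
            for pair in combinations(sorted(blk[i]), 2):
                tally[pair] += 1
    else:
        tally = dict.fromkeys(product(levels(i), levels(j)), 0)
        for blk in blocks:
            for pair in product(blk[i], blk[j]):
                tally[pair] += 1
    return set(tally.values())


# Assumed toy ingredients: the same partitioned design for all three factors.
delta = [[{0, 1}, {2, 3}], [{0, 2}, {1, 3}], [{0, 3}, {1, 2}]]
# Orthogonal array of strength 2: four rows, three two-symbol columns.
gamma = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

design = oa_construction([delta, delta, delta], gamma)

# Count how often each (level, level, level) triple occurs, to test strength 3.
triple_tally = dict.fromkeys(product(range(4), repeat=3), 0)
for blk in design:
    for triple in product(*blk):
        triple_tally[triple] += 1

print(len(design))                       # number of blocks, s*c
print(concurrence_values(design, 0, 0))  # lambda_11
print(concurrence_values(design, 0, 1))  # lambda_12
print(set(triple_tally.values()))        # more than one value: not strength 3
```

In this instance the new design has \(sc=4\times 3=12\) blocks with every \(\lambda _{ii}=2\) and every \(\lambda _{ij}=3\), and the triple counts are not constant, in line with the statement that the strength of the design equals the strength of \(\Gamma \).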

### Construction 9

(Products of multi-part designs) The ingredients of the previous construction are *m* individual 2-designs and an orthogonal array, which may be trivial. Instead, we may start with multi-part 2-designs, or an assortment of 2-designs and multi-part 2-designs. The use of orthogonal arrays and/or *c*-partitioning can be extended to this method too. As in Construction 2, we can allow one of the constituent designs to be not *c*-partitionable, so long as its number *b* of blocks is divisible by *c*.

The full product of an \(m_1\)-part 2-design \(\varTheta _1\) with \(b_1\) blocks and an \(m_2\)-part 2-design \(\varTheta _2\) with \(b_2\) blocks is an \((m_1+m_2)\)-part 2-design with \(b_1b_2\) blocks and strength 2. If \(\varTheta _1\) has strength \(m_1\) and \(m_2=1\) then the full product has strength \(m_1+1\). For example, if \(m=3\), \(v_3=3\) and \(k_3=2\) then the product of the design in Fig. 1 and a 2-design with three blocks of size 2 gives a 3-part 2-design with 30 blocks and strength 3.
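The full product is simple to express when each block is stored as a tuple of sets, one per factor. The sketch below makes that assumption and uses a toy ingredient of our own (the 2-(3, 2, 1) design with three blocks) rather than the design of Fig. 1; the function names are hypothetical.

```python
from itertools import combinations, product


def full_product(theta1, theta2):
    """Pair every block of theta1 with every block of theta2.

    Blocks are tuples of sets, so concatenating tuples gives a block
    with m1 + m2 parts.
    """
    return [blk1 + blk2 for blk1, blk2 in product(theta1, theta2)]


def tuple_counts(blocks, factors):
    """Values taken by the number of blocks containing each combination
    of one level from each of the listed (distinct) factors."""
    levels = [sorted(set().union(*(blk[f] for blk in blocks))) for f in factors]
    tally = dict.fromkeys(product(*levels), 0)
    for blk in blocks:
        for combo in product(*(blk[f] for f in factors)):
            tally[combo] += 1
    return set(tally.values())


# Toy 1-part ingredient: the three 2-subsets of {0, 1, 2}.
delta = [(frozenset(pair),) for pair in combinations(range(3), 2)]

theta1 = full_product(delta, delta)   # 2-part design, 9 blocks, strength 2
design = full_product(theta1, delta)  # 3-part design, 27 blocks

print(len(design))
print(tuple_counts(design, [0, 1]))     # lambda_12
print(tuple_counts(design, [0, 1, 2]))  # lambda_123, constant: strength 3
```

Here `theta1` has strength \(m_1=2\) and the second ingredient has \(m_2=1\), so the constant triple count illustrates the claim that the full product has strength \(m_1+1\).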

As an example of relaxing the *c*-partitionable condition, suppose that \(\varTheta \) is a *c*-partitionable 2-part 2-design for drugs and cancer types and \(\varDelta \) is a 2-design for \(v_3\) biomarkers in *c* blocks of size \(k_3\). We can simply match the blocks of \(\varDelta \) to the classes of \(\varTheta \) in any way. The 3-part 2-design in Fig. 12 was made like this by starting with a 2-part 2-design made by Construction 3, grouping blocks into ten classes of the form \(\{2i-1,2i\}\), and matching these classes to the ten blocks of a 2-design \(\varDelta \) for five biomarkers. Similarly, if \(v_1=v_2=v_3=6\) and \(k_1=k_2=k_3=3\) we can obtain a 3-part 2-design in 20 blocks by matching the ten blocks of a 2-(6, 3, 2) design to the ten classes in the left-hand side of Fig. 12.

The special case of the last part of this construction with \(b=c\) is the second general construction given by Mukerjee (1998). As noted in Sect. 2, this specialization does not restrict his designs. However, because we have now given more constructions for the case that \(m=2\), applying the various product constructions to them produces new designs for higher values of *m* also.

### 4.2 Other constructions

The augmentation method in Construction 5 easily generalizes to three or more factors. If \(v_i =2k_i +1\) then \(v_i\) and \(k_i\) can be increased by one while the number of blocks is merely doubled.

If \(v_1= \cdots =v_m = v\) and \(k_1= \cdots =k_m=k\) then the zipped form of an *m*-part 2-design is a semi-regular group-divisible design for *mv* treatments in blocks of size *mk* with \(k>1\). Just as in Construction 6, any such design can be unzipped to give an *m*-part 2-design. There are two such designs with \(m=3\) in Table VII of Clatworthy (1973). Their unzipped forms are the designs in Figs. 13 and 14. The one with \(m=4\) gives the design in Fig. 15, which can also be obtained from a \(9\times 4\) orthogonal array with three symbols in each column.
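Zipping and unzipping are purely bookkeeping operations. The sketch below shows one possible encoding, under the assumption that a zipped block is the disjoint union of the *m* parts, with each treatment written as a (factor index, level) pair so that the *m* groups of the group-divisible design are the factors. The function names are hypothetical.

```python
def zip_block(block):
    """Disjoint union of the m parts of a block, tagging each level
    with the index of the factor (group) it belongs to."""
    return frozenset((i, level) for i, part in enumerate(block) for level in part)


def unzip_block(zipped, m):
    """Split a zipped block back into its m parts, one per factor."""
    parts = [set() for _ in range(m)]
    for i, level in zipped:
        parts[i].add(level)
    return tuple(frozenset(part) for part in parts)


# One block with m = 3 factors, v = 3 levels per factor and k = 2.
block = (frozenset({0, 1}), frozenset({1, 2}), frozenset({0, 2}))
zipped = zip_block(block)
print(sorted(zipped))  # a single block of size m*k = 6
print(unzip_block(zipped, 3) == block)
```

The tagging matters: level 1 of factor 0 and level 1 of factor 1 are different treatments in the zipped design, which is why plain set union of the parts would not be enough.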

The group method in Construction 7 also easily extends to three or more factors: simply take a permutation group with more than two 2-transitive actions.

## Notes

### Acknowledgements

We thank Valerii Fedorov for posing this interesting problem.

## References

- Agrawal HL (1966) Some methods of construction of designs for two-way elimination of heterogeneity—1. J Am Stat Assoc 61:1153–1171
- Alltop WO (1972) An infinite class of 5-designs. J Comb Theory (A) 12:390–395
- Anthony MHG, Martin KM, Seberry J, Wild P (1990) Some remarks on authentication systems. In: Seberry J, Pieprzyk J (eds) Advances in cryptology, Auscrypt ’90. Volume 453 of Lecture Notes in Computer Science. Springer, New York, pp 122–139
- Bagchi S (1998) On two-way designs. Gr Comb 14:313–319
- Bailey RA (1992) Efficient semi-Latin squares. Stat Sin 2:413–437
- Bailey RA (2008) Design of comparative experiments. Cambridge University Press, Cambridge
- Bailey RA (2011) Symmetric factorial designs in blocks. J Stat Theory Pract 5:13–24
- Bailey RA (2017) Relations among partitions. In: Claesson A, Dukes M, Kitaev S, Manlove D, Meeks K (eds) Surveys in combinatorics 2017. Volume 400 of London Mathematical Society Lecture Note Series. Cambridge University Press, Cambridge, pp 1–86
- Bailey RA, Cameron PJ (2009) Combinatorics of optimal designs. In: Huczynska S, Mitchell JD, Roney-Dougal CM (eds) Surveys in combinatorics 2009. Volume 365 of London Mathematical Society Lecture Note Series. Cambridge University Press, Cambridge, pp 19–73
- Bose RC (1947) Mathematical theory of the symmetrical factorial design. Sankhyā 8:107–166
- Bose RC, Connor WS (1952) Combinatorial properties of group divisible incomplete block designs. Ann Math Stat 23:367–383
- Brickell EF (1984) A few results in message authentication. Congr Numer 43:141–154
- Caliński T, Kageyama S (2000) Block designs: a randomization approach. Volume I: Analysis. Volume 150 of Lecture Notes in Statistics. Springer, New York
- Cameron PJ, van Lint JH (1991) Designs, graphs, codes and their links. Volume 22 of London Mathematical Society Student Texts. Cambridge University Press, Cambridge
- Clatworthy WH (1973) Tables of two-associate-class partially balanced designs. Volume 63 of Applied Mathematics Series. National Bureau of Standards, Washington, DC
- Derhaschung U, Gilbert J, Jäger U, Böhmig G, Singl G, Jilma B (2016) Combined integrated protocol/basket trial design for a first-in-human trial. Orphanet J Rare Dis 11:134. https://doi.org/10.1186/s13023-016-0494-z
- DesignTheory.org (2012) http://designtheory.org/
- Eccleston JA, Russell KG (1977) Adjusted orthogonality in nonorthogonal designs. Biometrika 64:339–345
- Fedorov VV, Leonov SL (2019) Combinatorial and model-based methods in structuring and optimizing cluster trials. In: Beckman RA, Antonijevic Z (eds) Platform trials in drug development: umbrella trials and basket trials. Chapman & Hall/CRC Press, Boca Raton, pp 265–286
- Fisher RA (1935) The design of experiments. Oliver & Boyd, Edinburgh
- Fisher RA (1942) The theory of confounding in factorial experiments in relation to the theory of groups. Ann Eugen 11:341–353
- Hall M Jr (1986) Combinatorial theory, 2nd edn. Wiley, New York
- Hedayat AS, Sloane NJA, Stufken J (1999) Orthogonal arrays: theory and applications. Springer, New York
- Hoffman DG, Liatti M (1995) Bipartite designs. J Comb Des 3:449–454
- John JA, Williams ER (1995) Cyclic and computer generated designs. Volume 38 of Monographs on Statistics and Applied Probability. Chapman & Hall, London
- Li M, Liang M, Du B (2015) A construction of \(t\)-fold perfect splitting authentication codes with equal deception probabilities. Cryptogr Commun 7:207–215
- Martin KM, Seberry J, Wild PR (1992) Resolvable designs applicable to cryptographic authentication schemes. J Comb Math Comb Comput 12:153–160
- McSorley JP, Phillips NCK, Wallis WD, Yucas JL (2005) Double arrays, triple arrays and balanced grids. Des Codes Cryptogr 35:21–45
- Mukerjee R (1998) On balanced orthogonal multi-arrays: existence, construction and application to design of experiments. J Stat Plan Inference 73:149–162
- Phillips NCK, Wallis WD (1996) All solutions to a tournament problem. Congr Numer 114:193–196
- Preece DA (1966a) Some balanced incomplete block designs for two sets of treatments. Biometrika 53:479–486
- Preece DA (1966b) Some row and column designs for two sets of treatments. Biometrics 22:1–25
- Preece DA, Wallis WD, Yucas JL (2005) Paley triple arrays. Aust J Comb 33:237–246
- Sitter RR (1993) Balanced repeated replications based on orthogonal multi-arrays. Biometrika 80:211–221
- Soicher LH (1999) On the structure and classification of SOMAs: generalizations of mutually orthogonal Latin squares. Electron J Comb 6:R32
- Soicher LH (2013) Optimal and efficient semi-Latin squares. J Stat Plan Inference 143:573–582
- Woodcock J, LaVange LM (2017) Master protocols to study multiple therapies, multiple diseases, or both. N Engl J Med 377:62–70
- Yates F (1933) The principles of orthogonality and confounding in replicated experiments. J Agric Sci 23:108–145

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.