Melodic Pattern Segmentation of Polyphonic Music as a Set Partitioning Problem

Tanaka, Tsubasa; Fujii, Koichi

doi:10.1007/978-3-319-47337-6_29

First Online: 13 October 2017


Part of the book series: Computational Music Science (CMS)

Abstract

In polyphonic music, melodic patterns (motifs) are frequently imitated or repeated, and transformed versions of motifs, such as inversions, retrogrades, augmentations, and diminutions, often appear. Assuming that the economical reuse of motifs is a fundamental principle of polyphonic music, we propose a new method of analyzing a polyphonic piece that economically divides it into a small number of types of motif. To realize this, we take an integer programming-based approach and formalize the problem as a set partitioning problem, a well-known optimization problem. This analysis is helpful for understanding the roles of motifs and the global structure of a polyphonic piece.


1 Motif Division

In polyphonic music such as fugue-style pieces or J.S. Bach’s Inventions and Sinfonias, melodic patterns (motifs) are frequently imitated or repeated. Although some motifs are easy to find, others are not, because they often appear implicitly and/or in transformed versions such as inversions, retrogrades, augmentations, and diminutions. Motif analysis is therefore useful for understanding how polyphonic music is composed.

Simply put, we can consider the motifs that appear in a musical piece to be economical if the number of types of motif is small, the numbers of repetitions are large, and the motifs are long. Assuming that this economical efficiency of motifs is a fundamental principle of polyphonic music, we propose a new method of analyzing a polyphonic piece that efficiently divides it into a small number of types of motif. Using this division, the whole piece is reconstructed from a small number of types of motif, as in the puzzle game Tetris [1] (in Tetris, certain domains are divided using only seven types of piece). We call such a segmentation a motif division.

If a motif division is accomplished, it provides a simple, higher-level representation whose atom is a motif rather than a note, and it helps to clarify the structures of polyphonic music. The representation may provide knowledge about how frequently and where each motif is used, the relationships between motifs such as causality and co-occurrence, which transformations are used, how the musical form is constructed from motifs, and how long-term musical expectations are formed. This analysis may be useful for applications such as systems for music analysis, performance, and composition.

Studies on finding the boundaries of melodic phrases are often based on human cognition. For example, [2] is based on the grouping principles of gestalt psychology, and [3] is based on a short-term memory model. While these studies deal with a relatively short range of perception and require little computational time, we focus on the global configuration of motifs at the level of compositional planning. This requires solving a hard optimization problem. To deal with this difficulty, we take an integer programming-based approach [4] and show that the problem can be formalized as a set partitioning problem [5]. This problem can be solved by integer programming solvers that use efficient algorithms such as the branch and bound method.

2 Transformation Group and Equivalence Classes of Motif

In this section, we introduce equivalence classes of motif derived from a group of motif transformations as the criterion of identicalness of motifs. These equivalence classes are used to formulate the motif division in Sect. 3.

First, a motif is defined as an ordered collection of notes $[N_1, N_2, \ldots , N_k]$ ($k>0$), where $N_i$ is the information for the ith note, comprising the pitch $p_i$, start position $s_i$, and end position $e_i$ ($N_i = (p_i, s_i, e_i)$, $s_i<e_i \le s_{i+1}$). Next, let $\mathcal {M}$ be the set of every possible motif, and let $T_p$, $S_t$, R, I, $A_r$ be one-to-one mappings (transformations) from $\mathcal {M}$ to $\mathcal {M}$, where $T_p$ is the transposition by pitch interval p, $S_t$ is the shift by time interval t ($p, t \in \mathbb {R}$), R is the retrograde, I is the inversion, and $A_r (r >0)$ is the r-fold augmentation (diminution, in the case of $0<r<1$). These transformations generate a transformation group $\mathcal {T}$ whose operation is the composition of two transformations and whose identity element is the transformation that does nothing. Each transformation in $\mathcal {T}$ is a strict imitation that preserves the internal structure of the motif.

Here, a binary relation between a motif m ($\in \mathcal {M}$) and $\tau (m)$ ($\tau \in \mathcal {T}$) can be defined. Due to the group structure of $\mathcal {T}$, this relation is an equivalence relation (i.e., it satisfies reflexivity, symmetry, and transitivity [6]). It therefore induces equivalence classes in $\mathcal {M}$. Because the motifs that belong to the same equivalence class share the same internal structure, they can be regarded as identical (or of the same type) (see Note 1).
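To make the equivalence classes concrete, the following Python sketch computes a canonical key for a motif, so that motifs related by a composition of $T_p$, $S_t$, $A_r$, R, and I receive the same key. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, notes are `(pitch, start, end)` triples with integer or `Fraction` values, R is taken to reflect time, and I to reflect pitch.

```python
from fractions import Fraction


def normalize(motif):
    """Remove transposition, time shift, and augmentation.

    The first pitch and first start become 0, and the total time span
    (last end minus first start) is rescaled to 1.
    """
    p0, s0 = motif[0][0], motif[0][1]
    span = Fraction(motif[-1][2] - s0)
    return tuple(
        (p - p0, Fraction(s - s0) / span, Fraction(e - s0) / span)
        for p, s, e in motif
    )


def retrograde(motif):
    """Reflect time (assumed form of R): reverse the order, swap start/end."""
    return [(p, -e, -s) for p, s, e in reversed(motif)]


def inversion(motif):
    """Reflect pitch (assumed form of I)."""
    return [(-p, s, e) for p, s, e in motif]


def class_key(motif):
    """Smallest normalized form among m, R(m), I(m), and R(I(m))."""
    variants = [motif, retrograde(motif), inversion(motif),
                retrograde(inversion(motif))]
    return min(normalize(v) for v in variants)


# A four-note ascending figure (MIDI pitches, time in arbitrary units).
m = [(60, 0, 1), (62, 1, 2), (64, 2, 3), (65, 3, 4)]
# Transposed up a fifth, shifted, and augmented twofold.
m_aug = [(67, 10, 12), (69, 12, 14), (71, 14, 16), (72, 16, 18)]
# Inverted around the first pitch.
m_inv = [(60, 0, 1), (58, 1, 2), (56, 2, 3), (55, 3, 4)]
# A different figure (broken chord), which should fall in another class.
other = [(60, 0, 1), (64, 1, 2), (67, 2, 3), (72, 3, 4)]

print(class_key(m) == class_key(m_aug))
print(class_key(m) == class_key(m_inv))
print(class_key(m) == class_key(other))
```

Grouping all candidate motifs by `class_key` would then yield the classes used to decide whether two motifs are of the same type. Exact rational arithmetic is used so that augmented copies compare equal without rounding error.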

3 Formulation as a Set Partitioning Problem

A set partitioning problem, which is well known in operations research, is an optimization problem defined as follows. Let N be a set that consists of n elements $\{N_1,N_2, \ldots , N_n\}$, and let M be a family of sets $\{M_1,M_2, \ldots , M_m\}$, where each $M_j$ is a subset of N. A subset X of the indices of M is called a cover if $\bigcup _{ j \in X} M_j=N$, and a cover X is called a partition if $M_{j_1} \cap M_{j_2} = \varnothing $ for all distinct $j_1, j_2 \in X$. If a constant $c_j$, called a cost, is defined for each $M_j$, the problem of finding a partition X that minimizes the sum of the costs $\sum _{j \in X}c_j$ is called a set partitioning problem.
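The definition can be illustrated on a tiny instance. The Python sketch below enumerates every subset X of indices and keeps the cheapest one that is a partition. It is an exhaustive search for illustration only (practical instances call for an integer programming solver), and the function name and the example data are hypothetical.

```python
from itertools import combinations


def solve_set_partitioning(universe, subsets, costs):
    """Return (cost, X) for a minimum-cost partition, or None if none exists.

    X is a tuple of indices into `subsets`. A selection is a partition when
    its union is the universe and the subset sizes add up to the size of
    the universe (which rules out overlaps).
    """
    best = None
    for r in range(1, len(subsets) + 1):
        for X in combinations(range(len(subsets)), r):
            chosen = [subsets[j] for j in X]
            is_cover = set().union(*chosen) == set(universe)
            is_disjoint = sum(len(s) for s in chosen) == len(universe)
            if is_cover and is_disjoint:
                cost = sum(costs[j] for j in X)
                if best is None or cost < best[0]:
                    best = (cost, X)
    return best


universe = {1, 2, 3, 4}
subsets = [{1, 2}, {3, 4}, {1}, {2, 3, 4}, {2}, {1, 2, 3, 4}]
costs = [1, 1, 1, 1, 1, 3]

result = solve_set_partitioning(universe, subsets, costs)
print(result)
```

In this instance the single set covering everything is a valid partition but costs 3, whereas two cheaper sets together cover the universe at cost 2.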

3.1 Condition of Motif Division

If each $N_i$ corresponds to a note of the musical piece to be analyzed and each $M_j$ corresponds to a motif, the problem of finding the most economically efficient motif division can be interpreted as a set partitioning problem. The index i runs from the first note of a voice to the last note of that voice, and from the first voice to the last voice. $M_j (1\le j \le m)$ corresponds to $[N_1]$, $[N_1, N_2]$, $[N_1, N_2, N_3], \ldots , [N_2]$, $[N_2, N_3]$, $\ldots $ in this order. The number of notes in a motif is bounded by a fixed limit (Fig. 1).

This information can be represented by the following matrix A:

$$\begin{aligned} A=\left[ \begin{array}{cccccccccccccc} 1&{}1&{}1&{}1&{}0&{}0&{}0&{}0&{}0&{}0&{}0&{}0&{}0&{} \cdots \\ 0&{}1&{}1&{}1&{}1&{}1&{}1&{}1&{}0&{}0&{}0&{}0&{}0&{} \cdots \\ 0&{}0&{}1&{}1&{}0&{}1&{}1&{}1&{}1&{}1&{}1&{}1&{}0&{} \cdots \\ 0&{}0&{}0&{}1&{}0&{}0&{}1&{}1&{}0&{}1&{}1&{}1&{}1&{} \cdots \\ 0&{}0&{}0&{}0&{}0&{}0&{}0&{}1&{}0&{}0&{}1&{}1&{}0&{} \cdots \\ 0&{}0&{}0&{}0&{}0&{}0&{}0&{}0&{}0&{}0&{}0&{}1&{}0&{} \cdots \\ \vdots &{} \vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\ddots \end{array}\right] \end{aligned}$$

(1)

where each row corresponds to a note $N_{i}$ and each column indicates which notes are covered by the motif $M_j$. The matrix shown is for the case where the maximum number of notes in a motif is 4.
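The construction of A can be sketched in a few lines of Python. The sketch below enumerates candidate motifs in the order described above and builds the 0-1 matrix; the function names are hypothetical, indices are 0-based, and it is assumed here that a motif never crosses from one voice into another.

```python
def enumerate_motifs(voice_lengths, max_len):
    """List candidate motifs as tuples of 0-based global note indices.

    Motifs are runs of consecutive notes within one voice, ordered by
    starting note and then by length, with at most `max_len` notes.
    """
    motifs = []
    offset = 0  # global index of the first note of the current voice
    for n_voice in voice_lengths:
        for start in range(n_voice):
            longest = min(max_len, n_voice - start)
            for length in range(1, longest + 1):
                first = offset + start
                motifs.append(tuple(range(first, first + length)))
        offset += n_voice
    return motifs


def incidence_matrix(n_notes, motifs):
    """Entry [i][j] is 1 when note i belongs to motif j, else 0."""
    return [[1 if i in motif else 0 for motif in motifs]
            for i in range(n_notes)]


# One voice of six notes, at most four notes per motif.
motifs = enumerate_motifs([6], max_len=4)
A = incidence_matrix(6, motifs)
for row in A:
    print(" ".join(str(v) for v in row[:13]))
```

For a single voice of six notes, the first 13 columns printed by this sketch coincide with the entries displayed in (1).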

Representing the element of A as $a_{ij}$, the condition that the whole piece is exactly divided by a set of selected motifs can be described by the following constraints, which mean that each note $N_i$ is covered by one of $M_j$ once and only once:

$$\begin{aligned} \forall i \in \{1,2, \ldots , n\}, \ \sum _{j = 1}^{m} a_{ij} x_{j} = 1, \end{aligned}$$

(2)

where $x_{j}$ is a 0-1 variable that represents whether or not $M_j$ is used in the motif division. These conditions are equivalent to the condition of partitioning.
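As a small check of constraint (2), the Python sketch below tests whether a 0-1 selection vector x covers every note exactly once. The three-note matrix and the function name are hypothetical examples, not data from the piece analyzed later.

```python
def is_exact_division(A, x):
    """True when every row i satisfies sum_j A[i][j] * x[j] == 1."""
    return all(sum(a * xj for a, xj in zip(row, x)) == 1 for row in A)


# Three notes, motifs of at most two notes, in the column order
# [N1], [N1,N2], [N2], [N2,N3], [N3].
A = [
    [1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1],
]

exact = is_exact_division(A, [1, 0, 0, 1, 0])    # [N1] and [N2,N3]
overlap = is_exact_division(A, [0, 1, 0, 1, 0])  # N2 is covered twice
gap = is_exact_division(A, [1, 0, 0, 0, 1])      # N2 is not covered
print(exact, overlap, gap)
```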

3.2 Objective Function

The purpose of motif division is to find the most efficient solution among the many solutions that satisfy the partitioning condition. We must therefore define the efficiency of a motif division. The average length (number of notes) of the motifs used in the division is one of the simplest indicators of its efficiency. In addition, a division is more efficient when the number of motifs and the number of types of motif it uses are small.

In fact, because the total number of notes in the piece is fixed, the average length of the motifs is inversely proportional to the number of motifs. Therefore, if the number of types of motif (denoted by P) is fixed, the number of motifs is the quantity we should minimize.
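This relationship can be written out explicitly. Writing $|M_j|$ for the number of notes in $M_j$ (a notation introduced here only for illustration), every note of an exact division belongs to exactly one selected motif, so the lengths of the selected motifs sum to n:

$$\begin{aligned} \text {average length} = \frac{\sum _{j = 1}^{m} |M_j| x_{j}}{\sum _{j = 1}^{m} x_{j}} = \frac{n}{\sum _{j = 1}^{m} x_{j}}. \end{aligned}$$

Since n is determined by the piece, maximizing the average length amounts to minimizing the number of selected motifs $\sum _{j = 1}^{m} x_{j}$.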

The number of motifs can be simply represented by $\sum _{j = 1}^{m} x_{j}$. This is the cost function $\sum _{j = 1}^{m} c_{j} x_{j}$ whose $c_j$ is 1 for each j. We adopt this cost function. However, in the next subsection, we introduce additional variables and constraints to fix the number P.

3.3 Controlling the Number of Equivalence Classes

Let C be the set of equivalence classes of motifs derived from M, the set of all possible motifs that can be found in the piece (only motifs whose number of notes is within the limit are included in M). In other words, M is obtained from $\mathcal {M}$ by restriction, and the equivalence relation is restricted to M accordingly.

Let each element of C be denoted by $C_k (1\le k \le l)$, and let $y_k$ be a 0-1 variable that represents whether or not at least one member of $C_k$ appears in X (the set of selected motifs). This means that the statement “$y_k=1$ $\Leftrightarrow $ $\sum _{ j \in C_k} x_j>0$” must be satisfied. This statement can be represented by the following constraints, which use $\sum _{ j \in C_k} x_j$, the number of selected motifs that belong to $C_k$:

$$\begin{aligned} \forall k \in \{1,2, \ldots , l\}, \ y_k \le \sum _{ j \in C_k} x_j \le Q y_k, \end{aligned}$$

(3)

where Q is a constant that is sufficiently large.
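The effect of constraint (3) can be checked by enumeration. The Python sketch below lists, for a class with three candidate motifs, which values of $y_k$ are feasible for each assignment of the $x_j$. The function name is hypothetical; taking Q equal to the number of motifs in the class is one choice that is large enough.

```python
from itertools import product


def feasible_y(xs, Q):
    """Values of y in {0, 1} that satisfy y <= sum(xs) <= Q * y."""
    return [y for y in (0, 1) if y <= sum(xs) <= Q * y]


# With Q = 3 (the class size), y is forced to 1 exactly when some x_j is 1.
table = {xs: feasible_y(xs, Q=3) for xs in product((0, 1), repeat=3)}
for xs, ys in table.items():
    print(xs, ys)

# If Q is too small, a legitimate selection becomes infeasible.
too_small = feasible_y((1, 1, 0), Q=1)
print(too_small)
```

The last line shows why Q must be sufficiently large: with Q = 1, selecting two motifs of the same class leaves no feasible value for $y_k$.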

Then, the statement that the number of equivalence classes is P can be represented by the following constraint:

$$\begin{aligned} \sum _{k = 1}^{l} y_{k} = P. \end{aligned}$$

(4)

If P is reasonably small, the motif division tends to be simple. However, if P is too small, covering the whole piece with only a few motif classes becomes difficult, and one-note motifs are used too many times. This leads to a loss of efficiency of the motif division.

Therefore, we should find a good balance between a small value of the objective function and a small value of P. Because it is difficult to know in advance which value of P is adequate, we solve the optimization problem for each P in a certain range and then determine an adequate value of P by observing the solutions.
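The sweep over P can be illustrated on a toy melody. The Python sketch below replaces the integer programming solver with exhaustive enumeration, which is only practical for a handful of notes, and it simplifies the setting considerably: a single voice, notes of equal duration (so a motif class is determined by its interval sequence up to inversion and retrograde), and hypothetical function names and pitch data.

```python
def class_key(pitches):
    """Class key for equal-duration notes: the smallest of the interval
    sequence, its negation, its reversal, and its negated reversal."""
    iv = tuple(b - a for a, b in zip(pitches, pitches[1:]))
    inv = tuple(-d for d in iv)
    return min(iv, inv, iv[::-1], inv[::-1])


def segmentations(n, max_len, start=0):
    """Yield every division of notes start..n-1 into consecutive
    segments (a, b) of at most max_len notes."""
    if start == n:
        yield []
        return
    for length in range(1, min(max_len, n - start) + 1):
        for rest in segmentations(n, max_len, start + length):
            yield [(start, start + length)] + rest


def best_division(pitches, max_len, P):
    """Division with the fewest motifs that uses exactly P classes,
    or None if no such division exists."""
    best = None
    for seg in segmentations(len(pitches), max_len):
        classes = {class_key(pitches[a:b]) for a, b in seg}
        if len(classes) == P and (best is None or len(seg) < len(best)):
            best = seg
    return best


# Toy melody: two ascending three-note figures and one descending figure.
pitches = [60, 62, 64, 65, 67, 69, 72, 70, 68]
results = {P: best_division(pitches, 3, P) for P in range(1, 5)}
for P, seg in results.items():
    print(P, None if seg is None else len(seg), seg)
```

Inspecting the number of motifs obtained for each P in this way mirrors, on a very small scale, the procedure of choosing P by observing the solutions.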

4 Result

We analyzed J.S. Bach’s Invention No. 1 by solving the optimization problem described in the previous section. The maximum motif length was set to 7. We used the IP solver Numerical Optimizer 16.1.0 with a branch and bound method to search for the solution. After observing the solutions for various values of P, we set P to 13. It took less than one minute to obtain a solution for $P=13$.

Figure 2 shows the result of the motif division. Slurs mark the motifs; one-note motifs have no slur. Figure 3 shows representatives of the 13 motif classes that are used in the motif division.

This result tells us many things. For example, the 4th, 10th, and 11th motif classes in Fig. 3 are slightly different but can be regarded as the same motif, which corresponds to the subject of this piece. Searching for the domains where the subject does not appear, we find three domains whose durations are one and a half bars (indicated by the large rectangles). The ends of these domains coincide with the places where the cadences occur. Therefore, the three sections of this piece were detected properly.

The last motif class in Fig. 3 is an octave leap. This motif class appears in all of the cadence domains and is related to the ends of sections. In the cadence domains, it also co-occurs with the 2nd motif class, which is a two-note motif. The 12th motif class is a very characteristic one that includes a dotted note and a large leap. This motif class appears only before the cadence domains (the two motifs surrounded by the rounded rectangle). We can consider that this remarkable motif class plays an important role in signaling to listeners the end of the exposition of the subject and the beginning of the cadence domain.

The 9th (zigzag) motif class and the motif classes consisting of slow one-way movements, shown by the arrows in Fig. 2, appear only in ascending form in the first two sections. In contrast, these motif classes appear only in descending form in the final section. We interpret this contrast to mean that the ascending form creates a sense of continuation and the descending form creates a sense of conclusion. Thus, long-term musical expectations seem to be formed by the selection of transformations.

In such ways, motif division is useful to make us understand the roles of motifs and how global musical structures are formed.

5 Conclusion

In this paper, we formulated the problem of motif division, which decomposes polyphonic music into a small number of motif classes, as a set partitioning problem, and we obtained the solution using an IP solver. It was shown that the motif division provides useful information to understand the roles of motifs and how global musical structures are constructed from the motifs.

Future tasks include constructing a program that automatically analyzes global structures using the obtained motifs, and automatically composing new pieces that use the same motifs as the original piece based on the results of that analysis. Establishing a criterion for automatically determining an adequate value of P also remains an open problem.

Notes

1. Although the criterion for identical motifs defined here only deals with strict imitations, we can define the criterion in different ways to allow more flexible imitations, such as by (1) defining an equivalence relation from the equality of a shape type [7,8,9,10] and (2) defining a similarity measure and clustering the motifs using methods such as the k-medoids method [11] (the resulting clusters derive an equivalence relation). In any case, making equivalence classes from a certain equivalence relation is a versatile way to define the identicalness of motifs.

References

1. http://tetris.com
2. Cambouropoulos, E.: The local boundary detection model (LBDM) and its application in the study of expressive timing. In: Proceedings of ICMC, pp. 290–293 (2001)
3. Ferrand, M., Nelson, P., Wiggins, G.: Memory and melodic density: a model for melody segmentation. In: Proceedings of the XIV Colloquium on Musical Informatics, pp. 95–98 (2003)
4. Nemhauser, G.L., Wolsey, L.A.: Integer and Combinatorial Optimization. Wiley, New York (1988)
5. Balas, E., Padberg, M.W.: Set partitioning: a survey. SIAM Rev. 18(4), 710–760 (1976)
6. Armstrong, M.A.: Groups and Symmetry. Springer, Berlin (2012)
7. Buteau, C., Mazzola, G.: From contour similarity to motivic topologies. Musicae Scientiae 4(2), 125–149 (2000)
8. Mazzola, G., et al.: The Topos of Music: Geometric Logic of Concepts, Theory, and Performance. Birkhäuser, Basel (2002)
9. Buteau, C.: Topological motive spaces, and mappings of scores motivic evolution trees. In: Fripertinger, H., Reich, L. (eds.) Grazer Mathematische Berichte, Proceedings of the Colloquium on Mathematical Music Theory, pp. 27–54 (2005)
10. Buteau, C.: Melodic clustering within topological spaces of Schumann’s Träumerei. In: Proceedings of ICMC, pp. 104–110 (2006)
11. Bishop, C.: Pattern Recognition and Machine Learning, pp. 423–430. Springer, New York (2006)

Acknowledgements

This work was supported by JSPS Postdoctoral Fellowships for Research Abroad.

Author information

Authors and Affiliations

IRCAM, Paris, France
Tsubasa Tanaka
NTT DATA Mathematical Systems Inc., Tokyo, Japan
Koichi Fujii


Corresponding author

Correspondence to Tsubasa Tanaka .

Editor information

Editors and Affiliations

CENIDIM-INBA, Centro Nacional de las Artes (CENART), Coyoacán, Distrito Federal, Mexico
Gabriel Pareyon
División de Electrónica y Computación (CUCEI), Universidad de Guadalajara, Guadalajara, Jalisco, Mexico
Silvia Pina-Romero
Universidad de Cañada, Teotitlan de Flores Magón, Oaxaca, Mexico
Octavio A. Agustín-Aquino
Departamento de Matemáticas, UNAM Facultad de Ciencias, Coyoacán, Distrito Federal, Mexico
Emilio Lluis-Puebla


About this chapter

Cite this chapter

Tanaka, T., Fujii, K. (2017). Melodic Pattern Segmentation of Polyphonic Music as a Set Partitioning Problem. In: Pareyon, G., Pina-Romero, S., Agustín-Aquino, O., Lluis-Puebla, E. (eds) The Musical-Mathematical Mind. Computational Music Science. Springer, Cham. https://doi.org/10.1007/978-3-319-47337-6_29


DOI: https://doi.org/10.1007/978-3-319-47337-6_29
Published: 13 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47336-9
Online ISBN: 978-3-319-47337-6
eBook Packages: Computer Science, Computer Science (R0)
