Abstract
Raman spectroscopy is a well-established tool for the analysis of vibrational spectra, which allow individual substances in a chemical sample, or their phase transitions, to be determined. In time-resolved Raman spectroscopy, the vibrational spectra of a chemical sample are recorded sequentially over a time interval, so that conclusions about intermediate products (transients) of a chemical process can be drawn. The observed data matrix M from a Raman spectroscopy experiment can be regarded as the product of two unknown matrices W and H, where the first represents the component spectra and the latter their contributions to each measurement. One approach for obtaining W and H is nonnegative matrix factorization. We propose a novel approach which does not need the commonly used separability assumption. The performance of this approach is shown on a real-world chemical example.
Introduction
In Raman spectroscopy, vibrational spectra can be detected. Analysis of those spectra provides insight into chemical and physical properties of molecular structures, which is important in different research areas in biology, medicine and industry [1,2,3]. Nowadays, Raman spectrometers are capable of generating spectral recordings down to the femtosecond time scale. Such time-resolved Raman spectroscopy allows, besides spectral recordings of stable substances, for the monitoring of events like intramolecular rearrangements and chemical reactions [4]. We thereby obtain measured Raman spectra as a function of time, which capture both main characteristics of an observed process. On the one hand, each measured spectrum is a fingerprint of compounds and therefore represents the intrinsic spectra of the individual species or molecular states involved in the reaction. On the other hand, the relative contributions of the involved spectra to each measured spectrum reflect the momentary composition of the sample at the corresponding time. From the full series of generated spectra we can hence draw conclusions about the kinetics of the underlying reaction process. Consequently, the central task of time-resolved Raman data analysis is deciphering the series of measured spectra with respect to the individual component spectra and their temporal evolution.
This article is organized as follows. In Sect. 2, we give an overview of NMF approaches and algorithms known so far. In particular we present the separable NMF method, which found application in the approach for spectral analysis in [5]. Our new NMF approach, as well as the algorithmic details of the corresponding computational method, are introduced in Sect. 3. In Sect. 4, we present numerical results of our novel method. On the one hand, we thereby discuss recovery results for synthetic measurement data with increasing interference of the component spectra and presence of measurement noise. On the other hand, we verify the influence of the single components of our adaptable objective function through recovery results for certain choices of weighting coefficients.
Nonnegative matrix factorization (NMF)
From a mathematical point of view, the nonnegative measurement matrix M, which contains the discretized time-resolved Raman spectra, can be expressed as
$$\begin{aligned} M = WH , \end{aligned}$$(1)
where the columns of W represent the component spectra and H the course of the relative concentrations. A factorization of M into the two matrices W and H is interesting from the chemical point of view: the matrix W tells us which substances are involved in the reaction, and the matrix H allows inference on the speed of the reaction. Note that this is not possible by considering only one row or column of the matrix M. Summing up, time-resolved Raman spectral data can be modeled as the product of two nonnegative matrices representing the single component spectra and the underlying reaction kinetics (Fig. 1).
Recovering these factor matrices, given only the measured time-resolved spectra, requires nonnegative matrix factorization (NMF). In general, NMF is a useful tool for the analysis of high-dimensional data and therefore a relevant topic in present-day research in many scientific fields [6,7,8]. Besides detecting a compressed representation, NMF delivers insights into the structure and features of the given data by extracting easily interpretable factors.
The goal of nonnegative matrix factorization (see e.g. [8, 9] and the references therein) of an input data matrix M is to solve an optimization problem in order to find matrices W and H with nonnegative entries such that the product WH is the best possible approximation of M. NMF is a linear dimension reduction technique for nonnegative data sets, which means that each data point (column of M) is approximated by a linear combination of the columns of W.
Mathematical background The columns of W form a basis for the column space of matrix M and the columns of matrix H are the weights to approximate the data points. The NMF problem is \(\mathcal {NP}\)-hard [10], due to the nonnegativity constraints on W and H. Moreover, the solution of an NMF problem is generally not unique. To see this, assume that \(W>0\), \(H>0\), and that there exists an invertible matrix D such that \(WD>0\) and \(D^{-1}H >0\). Then \(M=(WD)(D^{-1}H)\) is a second nonnegative factorization, which shows that the NMF is not unique.
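The non-uniqueness argument can be checked numerically. The following is a minimal sketch with hypothetical \(2\times 2\) matrices W, H and D, chosen only for illustration and not taken from the data discussed in this article.

```python
import numpy as np

# Hypothetical positive factors, chosen only to illustrate the argument.
W = np.array([[2.0, 1.0],
              [1.0, 2.0]])
H = np.array([[0.6, 0.3],
              [0.4, 0.7]])
M = W @ H

# D mixes the two factors slightly; it is neither a permutation nor a scaling.
D = np.array([[0.9, 0.1],
              [0.1, 0.9]])
W2 = W @ D                   # second candidate for the spectra
H2 = np.linalg.inv(D) @ H    # second candidate for the concentrations

same_product = np.allclose(W2 @ H2, M)
both_positive = bool((W2 > 0).all() and (H2 > 0).all())
factors_differ = not np.allclose(W2, W)
```

Both pairs (W, H) and (W2, H2) are positive and reproduce the same M, so the data alone cannot distinguish between them.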
In the absence of the nonnegativity constraints, the problem could be solved efficiently by using methods such as the truncated singular value decomposition (TSVD) [11]. One of the common approaches for solving the NMF problem is the alternating least squares approach [12, 13]. In this approach, one of the two matrices is fixed, for example H, and then we find the corresponding optimal solution for W, which is a convex optimization problem with nonnegativity constraints. Then the roles of W and H are exchanged, and the two steps alternate. If the matrix M satisfies a separability condition, then we can solve the NMF problem efficiently. By definition, a matrix M is r-separable if there exists an exact nonnegative factorization of rank r in which each column of W is equal to a column of M. This means that each column of W, being a basis vector for the column space of M, appears somewhere in the data matrix M as one of its columns.
Geometrically, under separability the columns of W are the vertices of the convex hull of the columns of M. The separability condition means that all columns of M can be reconstructed as convex combinations of r columns of M, namely the columns of W [14, 15]. This is only possible if the convex hull of the columns of M is a simplex spanned by r of these columns, which is not necessarily the case.
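The alternating scheme can be sketched as follows. Note that this sketch does not solve the constrained subproblems exactly: it uses the simpler projected variant, in which each unconstrained least squares solution is clipped at zero. The function name `projected_als` and its parameters are assumptions made for this illustration.

```python
import numpy as np

def projected_als(M, r, iters=200, seed=0):
    """Alternate least squares updates of W and H, clipping negatives to zero."""
    rng = np.random.default_rng(seed)
    n, m = M.shape
    W = rng.random((n, r))
    H = rng.random((r, m))
    for _ in range(iters):
        # Fix W, solve min ||M - W H||_F for H, then project onto H >= 0.
        H = np.clip(np.linalg.lstsq(W, M, rcond=None)[0], 0.0, None)
        # Fix H, solve the transposed problem for W, then project onto W >= 0.
        W = np.clip(np.linalg.lstsq(H.T, M.T, rcond=None)[0].T, 0.0, None)
    return W, H
```

The clipping step is a heuristic; it keeps both factors nonnegative but gives no guarantee of reaching a global minimum, in line with the hardness of the general problem.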
NMF in the context of measurement data Given a componentwise nonnegative matrix M of dimension \(n \times m\) and an integer \(r>0\), NMF determines componentwise nonnegative matrices W and H of dimensions \(n \times r\) and \(r \times m\), respectively, such that \(M=WH\). The integer r is called the rank of the factorization. Assuming that M represents m measurements of n nonnegative variables, we interpret the NMF task as follows: we aim to identify r ingredients which allow for the recovery of all m measurements by composition according to their respective contributions. The ingredients are then reflected by the columns of the factor matrix W, while the columns of H contain the corresponding mixing coefficients.
In practice, considering measured data, and therefore allowing noise or other forms of data uncertainty, generally rules out the existence of an exact NMF in terms of \(M=WH\). Thus, from now on, we want to compute componentwise nonnegative matrices W and H such that WH is an approximation of M.
In the context of Raman spectral data analysis, focusing on the nonnegativity of the involved matrices becomes reasonable through the model for time-resolved Raman spectral data of Luce et al. [5]. They introduce an approach to express a series of spectral recordings of a chemical reaction (matrix M) as the matrix product of the component spectra (matrix W) and the evolution of relative concentrations of these reaction components (matrix H). Based on this model and synthetic spectral data, which satisfy the recently much-cited separability assumption, the authors of [5] furthermore present an algorithm to detect a factorization \(WH=M\) using separable NMF methods.
Inspired by their results, we propose a novel method which does not rely on the separability assumption, since in the context of spectral analysis this assumption is very restrictive. The separability assumption means that the convex hull of the columns of M is spanned by the column vectors of W. In other words, it means that the convex hull of the columns of M is a simplex, which is not necessarily the case for real-world data. Of course, we are searching for a simplex that includes all column vectors of M, but the convex hull of these columns need not itself be a simplex. Thus, we will exploit additional chemical or physical model aspects in order to find the optimal simplex including the columns of M without the separability assumption. The key idea of this new approach is to use an adaptable objective function that takes into account only the common structural properties of the sought-for, process-defining matrices W and H.
Solving an optimization problem for NMF
In the following we pick up the concepts of the previous sections as we introduce a new NMF approach which is specialized in the analysis of time-resolved Raman spectral data. Recall from (1) that the recovered nonnegative matrices represent the component spectra of the involved species (W) and the reaction kinetics in terms of the evolution of relative concentrations (H). Our novel NMF approach differs from the methods discussed so far in that it is mainly based on the minimization of an objective function which directly incorporates all known structural properties of the sought-for matrices W and H. Furthermore, our approach does not depend on the restrictive separability assumption. In contrast to Luce et al. [5], we can apply our method even to nonseparable measurement data. Additional flexibility and adaptability of the novel approach will be illustrated by the numerical results in Sect. 4. Here we present the leading ideas of this approach as well as the details of the corresponding computational method.
Optimization criteria for NMF
In the following we propose a novel approach based on an objective function which includes the required structural properties of the sought-after matrices W and H.
Claims on the matrices W and H In the following we assume that the component spectra are nonnegative, such that W is a componentwise nonnegative matrix. The componentwise nonnegativity of the kinetics H is also reasonable, since relative concentrations are, in general, nonnegative. Furthermore, because its columns represent relative concentrations, each column of H is a priori supposed to sum up to 1.
For each of the r chemical species \(s = 1, \ldots , r\), the relative concentration is given by the relative concentration function
$$\begin{aligned} h_s : \left[ 0,T \right] \rightarrow \left[ 0,1 \right] , \end{aligned}$$
describing the relative concentration of species s at time \(t \in \left[ 0,T \right]\) of the considered reaction.
Since the concentrations \(h_s(t)\) are relative, we have
$$\begin{aligned} \sum \limits _{s=1}^r h_s(t) = 1 \quad \text {for all } t \in \left[ 0,T \right] . \end{aligned}$$
By using m time steps \(t_0, \ldots , t_{m-1}\) for the discretization of the concentration functions \(h_s(t)\) we obtain the column stochastic matrix
$$\begin{aligned} H = \big ( h_s(t_j) \big )_{s=1,\ldots ,r;\; j=0,\ldots ,m-1} \; \in {\mathbb {R}}^{r\times m} . \end{aligned}$$
The sequential Raman measurements cannot be modelled as a “random picking of spectra”; the temporal order of the measurements is important. Let the columns of H be given by \(h(t_i), i=0,\ldots ,m-1\), i.e.
$$\begin{aligned} H = \left[ h(t_0), \ldots , h(t_{m-1}) \right] . \end{aligned}$$
Given the “concentrations” \(h(t_{i-1})\), there is a kinetics (or some Markov process) providing the concentrations \(h(t_i)\) of the next time step. If the time intervals between the measurements are constant, this can be modelled by a transition matrix P of an autonomous Markov process. Thus, we claim that there exists a (row) stochastic matrix \(P \in {\mathbb {R}}^{r\times r}\) such that
$$\begin{aligned} h(t_i)^T = h(t_{i-1})^T P , \qquad i=1,\ldots ,m-1 . \end{aligned}$$(2)
In other words, the change of the relative concentrations between the time steps can be interpreted as a Markov process. The construction of this matrix P will be explained later.
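The Markov interpretation can be illustrated with a small numerical sketch. It assumes that the relation between consecutive columns of H reads \(h(t_i)^T = h(t_{i-1})^T P\) for a row stochastic P. The matrix `P`, the initial vector and the number of time steps are hypothetical values chosen for this illustration.

```python
import numpy as np

# Hypothetical row stochastic transition matrix for r = 3 species.
P = np.array([[0.90, 0.10, 0.00],
              [0.00, 0.80, 0.20],
              [0.00, 0.00, 1.00]])
h = np.array([1.0, 0.0, 0.0])      # initial relative concentrations h(t_0)

m = 20
cols = [h]
for _ in range(m - 1):
    h = P.T @ h                    # column form of h(t_i)^T = h(t_{i-1})^T P
    cols.append(h)
H = np.column_stack(cols)          # r x m matrix of kinetics

# Least squares estimate of P from consecutive columns of H.
P_est = np.linalg.pinv(H[:, :-1].T) @ H[:, 1:].T
```

Because P is row stochastic and h(t_0) sums to 1, every column of H stays nonnegative and sums to 1. The estimate `P_est` reproduces P here because the relation holds exactly and the first columns of H are linearly independent.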
Summing up, the objective function in our approach penalizes violations of the following properties:

(i) W is componentwise nonnegative,

(ii) H is componentwise nonnegative,

(iii) H is column stochastic,

(iv) P is componentwise nonnegative, and

(v) P is row stochastic.
Summing up, we arrive at the following objective function
It has to be mentioned here that constraint (iv) is not necessarily valid. The matrix P has to be row stochastic in the sense that its rows sum to one; its entries, however, can be negative. A Galerkin projection of a Markov process from a basis of microstates onto a small set of macrostates can lead to negative entries in the projected matrix P. In the real-world example in Sect. 4.3, we will show a crystallization process with a nonexponential decay of one species, which leads to a matrix P with one negative entry.
Robust Perron cluster analysis (PCCA+) In the computational method of our novel NMF approach, we apply the Robust Perron Cluster Analysis (PCCA+) [16] to generate an initialization of the kinetics in matrix H. We thus briefly introduce the purpose and operating principles of PCCA+ and explain its utility in our context.
PCCA+ belongs to the family of clustering algorithms, which characterize objects of similar behaviour in order to combine them into a certain number of clusters. This kind of task plays a versatile role in several areas of computational life science. PCCA+ arises from the investigation of molecular conformation dynamics and the identification of metastable conformations [17, 18]. There, metastable conformations are clusters for which the large-scale geometric structure of the observed ensemble is conserved under the influence of a spatial transition operator [19]. Translating this approach into matrix terms, we consider a stochastic matrix \(T \in {\mathbb {R}}^{N\times N}\) (representing the discretized version of the spatial transition operator) and we search for a matrix \(Y \in {\mathbb {R}}^{N\times N_C}\), which columnwise contains the clusters \(y_i ,\; i=1, \dots ,N_C\), and satisfies three requirements. First, Y is nonnegative. Second, Y is row stochastic, in order to meet the partition-of-unity constraint. Third, the vectors \(y_i\) are associated with the cluster of eigenvalues of T near 1.0. This means that for each \(i=1, \dots ,N_C\) we have
$$\begin{aligned} T y_i \approx y_i . \end{aligned}$$
The main idea of PCCA+ is to generate Y as a linear transformation of the matrix \(X\in {\mathbb {R}}^{N\times N_C}\), where X columnwise contains the \(N_C\) first eigenvectors of T with respect to eigenvalues close to \(\lambda _1 = 1\). PCCA+ therefore computes a nonsingular transformation matrix \({\mathcal {A}}\in {\mathbb {R}}^{N_C \times N_C}\) in order to obtain the nonnegative, row stochastic matrix Y via
$$\begin{aligned} Y = X {\mathcal {A}} . \end{aligned}$$(4)
Above, in the paragraph on the claims on the matrices W and H, we stated that the sought-for matrix H of reaction kinetics needs to be nonnegative and column stochastic. Both requirements are satisfied if we consider (4) and choose \(H=Y^T\) as an initial guess of the kinetics. Thus, in the computational method of our novel NMF approach, the preprocessing prepares the application of PCCA+ in order to generate a promising initialization of H.
When solving (4) for \({\mathcal {A}}\), we may find several feasible solutions \({\mathcal {A}}\in {\mathbb {R}}^{N_C \times N_C}\) providing an appropriate matrix Y. PCCA+ resolves this ambiguity by computing \({\mathcal {A}}\) as the solution of an optimization problem with respect to a certain objective function. Given that the stochastic matrix T is the discretization of a transition operator (consider e.g. molecular conformation dynamics), maximization of this objective function is equivalent to the maximization of metastability between the generated clusters. In other contexts (consider e.g. geometrical cluster problems) the interpretation of the objective function may be different while still meaningful. See [17, 20, 21] for exemplary applications and illustrations of PCCA+ in several research areas.
Computational method
The main work stages in the computational method of our novel NMF approach are summarized in Algorithm 1. Note that we distinguish between the finally recovered matrices (denoted as \(W_{rec}\) and \(H_{rec}\)) and their corresponding interim results (denoted as \({\widetilde{W}}\) and \({\widetilde{H}}\)). Furthermore, we use the MATLAB function pinv to calculate pseudoinverses of singular or even nonsquare matrices, and we denote the pseudoinverse of a matrix A by \(A^{\dagger }\). Finally, \(A_+\) denotes the matrix which is constructed from A by deleting the first row, and \(A_-\) is the corresponding matrix constructed from A by deleting the last row.

Step 1: Preprocessing In the preprocessing we consider \(M^T\). By subtracting a reference point we transfer the columns of \(M^T\) into a linear space. Afterwards we perform a singular value decomposition (SVD), which yields \(M^T = U\Sigma V^T\). In order to initialize \({\widetilde{H}}\) we want to apply PCCA+ to the leading \(r-1\) columns of U. Thus we build a matrix \({\mathcal {U}}\), which takes the role of X in (4), as follows: the first column of \({\mathcal {U}}\) is equal to \(e=\left[ 1, \ldots , 1 \right] ^T \in {\mathbb {R}}^m\), which is a requirement of PCCA+. We then fill up with columns \(1, \ldots , r-1\) of U until \({\mathcal {U}}\in {\mathbb {R}}^{m\times r}\). Subsequently, for reasons of efficiency of PCCA+, we ensure orthogonality among the columns of \({\mathcal {U}}\) [16].
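A minimal sketch of this preprocessing step is given below. The text does not specify the reference point; the sketch assumes the mean spectrum over all time steps. The function name `preprocess` and the way orthogonality is enforced (projection onto the complement of e followed by a QR factorization) are likewise assumptions of this illustration.

```python
import numpy as np

def preprocess(M, r):
    """Build the m x r matrix that takes the role of X in PCCA+."""
    Mt = M.T                                  # m x n, one measured spectrum per row
    centered = Mt - Mt.mean(axis=0)           # assumed reference point: mean spectrum
    U, _, _ = np.linalg.svd(centered, full_matrices=False)
    m = Mt.shape[0]
    e = np.ones(m)
    lead = U[:, :r - 1]                       # leading r-1 left singular vectors
    lead = lead - np.outer(e, e @ lead) / m   # remove any constant part
    lead, _ = np.linalg.qr(lead)              # re-orthonormalize the remaining columns
    return np.column_stack([e, lead])         # first column is e, as PCCA+ requires
```

With the mean spectrum as reference point, the leading singular vectors are already orthogonal to e, so the projection acts only as a safeguard for other choices of the reference point.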

Step 2: Initializing \({\widetilde{H}}\), \({\widetilde{W}}\), and \({\widetilde{P}}\) We apply PCCA+ to \({\mathcal {U}}\). According to (4), we obtain a nonnegative, column stochastic matrix \({\widetilde{H}}\) by setting
$$\begin{aligned} {\widetilde{H}} = \left( {\mathcal {U}} {\mathcal {A}}\right) ^T \; \in {\mathbb {R}}^{r\times m}, \end{aligned}$$(5)
where \({\mathcal {A}}\in {\mathbb {R}}^{r\times r}\) is the computed PCCA+ transformation matrix. \({\widetilde{H}}\) is our initial guess of the kinetics of relative concentrations. Accordingly, we obtain an initialization of the component spectra \({\widetilde{W}}\) through the relation
$$\begin{aligned} M&= {\widetilde{W}} {\widetilde{H}} \nonumber \\ \Leftrightarrow \qquad {\widetilde{W}}&= \ M {\widetilde{H}}^{\dagger } = M \left( {\mathcal {A}}^T {\mathcal {U}}^T \right) ^{\dagger } \; \in {\mathbb {R}}^{n\times r} . \end{aligned}$$(6)
From (2), we can see that the matrix \({\widetilde{P}}\) is given by
$$\begin{aligned} {\widetilde{P}}&=(({\widetilde{H}}_-)^T)^{\dagger }({\widetilde{H}}_+)^T \nonumber \\&= {\mathcal {A}}^{-1}\big ({\mathcal {U}}_-^\dagger {\mathcal {U}}_+\big ){\mathcal {A}}. \end{aligned}$$(7)
With (5), (6), and (7) we express the initial guesses of the sought-for matrices only in terms of the given and processed data (M, \({\mathcal {U}}\)) and the PCCA+ transformation matrix (\({\mathcal {A}}\)).
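The relations (5) to (7) translate directly into a few lines of linear algebra. The sketch below uses NumPy's `pinv` in place of the MATLAB function; the function name `initial_guesses` and the synthetic test data are assumptions made for illustration, and the transformation matrix is simply constructed by hand instead of being computed by PCCA+.

```python
import numpy as np

def initial_guesses(M, cal_U, A):
    """Evaluate (5)-(7): H = (U A)^T, W = M H^+, P = A^{-1} (U_-^+ U_+) A."""
    H = (cal_U @ A).T
    W = M @ np.linalg.pinv(H)
    U_minus, U_plus = cal_U[:-1, :], cal_U[1:, :]   # last / first row deleted
    P = np.linalg.inv(A) @ np.linalg.pinv(U_minus) @ U_plus @ A
    return W, H, P

# Hypothetical test data: Markov kinetics H_true and random spectra W_true.
rng = np.random.default_rng(1)
P_true = np.array([[0.9, 0.1, 0.0], [0.0, 0.8, 0.2], [0.0, 0.0, 1.0]])
H_true = np.empty((3, 15))
H_true[:, 0] = [1.0, 0.0, 0.0]
for i in range(1, 15):
    H_true[:, i] = P_true.T @ H_true[:, i - 1]
W_true = rng.random((40, 3))
M = W_true @ H_true

B = rng.random((3, 3)) + 3.0 * np.eye(3)   # invertible change of basis
cal_U = H_true.T @ B                       # some basis of the column space of H_true^T
W, H, P = initial_guesses(M, cal_U, np.linalg.inv(B))
```

In this constructed example the transformation matrix is known exactly, so the three formulas return the original factors. In practice \({\mathcal {A}}\) is unknown, which is why it is first estimated by PCCA+ and then optimized.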

Step 3: Minimizing objective function The objective function of our novel NMF approach only incorporates structural properties of the sought-for matrices, as discussed above in the paragraph on the claims on the matrices W and H. With respect to each property we evaluate a penalty value as stated in the following expressions:
$$\begin{aligned} \left. \begin{aligned} \text {Penalty 1:} \qquad&\alpha \left( \min \limits _{i,j} \; {\widetilde{W}}_{ij} \right) \\ \text {Penalty 2:} \qquad&\beta \left( \min \limits _{i,j} \; {\widetilde{H}}_{ij} \right) \\ \text {Penalty 3:} \qquad&\gamma \left( \max \limits _j \; \sum \limits _{i=1}^r \; {\widetilde{H}}_{ij} - 1 \right) \\ \text {Penalty 4:} \qquad&\delta \left( \min \limits _{i,j} \; {\widetilde{P}}_{ij} \right) \\ \text {Penalty 5:} \qquad&\mu \left( \max \limits _i \; \sum \limits _{j=1}^r \; {\widetilde{P}}_{ij} - 1 \right) \end{aligned} \right\} \end{aligned}$$(8)
With regard to the nonnegativity of light intensities and relative concentrations, Penalties 1, 2, and 4 determine the smallest entries of the matrices \({\widetilde{W}}\), \({\widetilde{H}}\), and \({\widetilde{P}}\). Since the sum of penalty values is supposed to increase if these smallest entries are negative, the weighting coefficients \(\alpha\), \(\beta\), and \(\delta\) are generally chosen negative, too. For \({\widetilde{H}}\) to be column stochastic, the maximal deviation of a column sum from 1.0 is penalized in Penalty 3. Likewise, the requirement on \({\widetilde{P}}\) to be row stochastic is taken into account in Penalty 5 by computing the maximal deviation of a row sum from 1.0.
Let \(\Psi\) denote the sum of the penalty values. As we choose the relations (5) and (6) for initialization, the input arguments of the objective function are the matrices M, \({\mathcal {U}}\) and \({\mathcal {A}}\). Since we perform the optimization with respect to the parameter \({\mathcal {A}}\), the minimization problem can be written in the form
$$\begin{aligned} \min \limits _{{\mathcal {A}}\in {\mathbb {R}}^{r\times r}} \; \Psi ^2 . \end{aligned}$$
Minimizing \(\Psi ^2\) hence numerically adjusts the matrices \({\widetilde{W}}\) and \({\widetilde{H}}\) according to the claimed structural properties. For the computation we apply the MATLAB function fminsearch, which uses the Nelder-Mead simplex search method as described by Lagarias et al. [22].
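The penalty sum \(\Psi\) from (8) can be sketched as a function of the flattened transformation matrix, which is the form a derivative-free optimizer expects. The function names `psi` and `objective` and the ordering of the weights are assumptions of this sketch; Penalty 5 is evaluated over row sums, in line with the requirement that \({\widetilde{P}}\) is row stochastic.

```python
import numpy as np

def psi(A_flat, M, cal_U, weights):
    """Sum of the five penalty values in (8) for a flattened r x r matrix A."""
    alpha, beta, gamma, delta, mu = weights
    r = cal_U.shape[1]
    A = np.asarray(A_flat, dtype=float).reshape(r, r)
    H = (cal_U @ A).T                                   # relation (5)
    W = M @ np.linalg.pinv(H)                           # relation (6)
    P = np.linalg.inv(A) @ np.linalg.pinv(cal_U[:-1]) @ cal_U[1:] @ A   # relation (7)
    return (alpha * W.min()                             # Penalty 1
            + beta * H.min()                            # Penalty 2
            + gamma * (H.sum(axis=0) - 1.0).max()       # Penalty 3, column sums
            + delta * P.min()                           # Penalty 4
            + mu * (P.sum(axis=1) - 1.0).max())         # Penalty 5, row sums

def objective(A_flat, M, cal_U, weights):
    """Squared penalty sum, the quantity that is minimized over A."""
    return psi(A_flat, M, cal_U, weights) ** 2
```

In Python, a Nelder-Mead search comparable to fminsearch is available through `scipy.optimize.minimize(objective, A0.ravel(), args=(M, cal_U, weights), method="Nelder-Mead")`, where `A0` would be the PCCA+ result used as starting point.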

Step 4: Recovering \(W_{rec}\), \(H_{rec}\), and \(P_{rec}\) The minimization in Step 3 finally returns a transformation matrix \({\mathcal {A}}_{\text {opt}}\). We then recover the kinetics \(H_{rec}\) of relative concentrations, the component spectra \(W_{rec}\), and the transition matrix \(P_{rec}\) according to (5)–(7) as
$$\begin{aligned} H_{rec}&= \left( {\mathcal {U}} {\mathcal {A}}_{\text {opt}} \right) ^T = {\mathcal {A}}_{\text {opt}}^T {\mathcal {U}}^T \; \in {\mathbb {R}}^{r\times m} , \\ W_{rec}&= M H_{rec}^{\dagger } = M \left( {\mathcal {A}}_{\text {opt}}^T {\mathcal {U}}^T \right) ^{\dagger } \; \in {\mathbb {R}}^{n\times r}, \\ P_{rec}&= {\mathcal {A}}_{\text {opt}}^{-1}\big ({\mathcal {U}}_-^\dagger {\mathcal {U}}_+\big ){\mathcal {A}}_{\text {opt}} \; \in {\mathbb {R}}^{r\times r}. \end{aligned}$$
With regard to NMF in the context of Raman spectral data analysis, our novel approach offers two main advancements. Firstly, in contrast to the method of Luce et al. [5], our novel NMF approach does not depend on the separability assumption. Since we only consider the general properties of the sought-for matrices without further demands on the input data, we may apply the novel approach to the broader range of even nonseparable spectral data. Secondly, note the possibility to adapt the decisive objective function in Step 3 by the choice of the weighting coefficients \(\alpha , \beta ,\gamma ,\delta\) and \(\mu\) or by the addition of further penalty terms. This flexibility and adaptability of our method allows for a special focus on certain data properties or even an extension of the recovery objectives. We remark that the approach of optimizing \(P_{rec}\) has already been suggested in [23], and (7) has recently been applied in [24].
The next section presents some numerical experiments.
Numerical results
In this section we demonstrate the performance of our novel NMF approach by applying it to a sequence of artificial time-resolved Raman spectral data. After describing the generation of the reaction data in Sect. 4.1, we show in Sect. 4.2 that the component spectra are recovered with high quality and that we even reach meaningful approximations of the underlying reaction kinetics. In Sect. 4.2 we also demonstrate the effectiveness of our method in the case of increased overlap among the individual component spectra and in the presence of measurement noise. In Sect. 4.3, we present real-world data from Raman spectroscopy measured during a crystallization process of paracetamol in ethanol. We show that our method can help to identify and characterize intermediate states (and their lifetimes) of a chemical process.
Description of the reaction data generation
As in Sect. 2, for the model of time-resolved Raman spectral data, we here again follow the framework of Luce et al. [5].
Regarding the generation of artificial time-resolved Raman spectral data we consider a reaction scheme with five involved species A, B, C, D and E which are interrelated by first-order reactions. These first-order reactions are characterized by a rate matrix of transition coefficients as follows:
The rows \(i =1, \dots ,5\) of K reflect the transition behaviour of the corresponding species in the course of the observed reaction. For example, \(K_{12}\) says that 53% of the amount of species A is converted into species B per (arbitrary) unit of time. The diagonal entries of K represent the total relative loss of each species per time unit. Thus we already notice that species D is the only product of the modeled reaction, since it is the only species that exclusively absorbs rates. Here, we let species A be the only educt of the reaction and therefore set the initial concentration vector to \(h_0 :=h(t_0) =~\left[ 1, 0, 0, 0, 0 \right] ^T\). With \(h_0\) and the rate matrix K we obtain the reaction kinetics as a function of time by
$$\begin{aligned} h(t) = \left[ h_1(t), \dots , h_5(t) \right] ^T = \exp \left( t K^T \right) h_0 , \end{aligned}$$
where \(h_i (t)\) denotes the relative concentration of species i at time t. The resulting kinetics are displayed in Fig. 2 (right). We obtain the corresponding matrix H of kinetics by discretization of h(t) at equidistant time steps \(t_0, \dots , t_{m-1}\) such that \(H=~\left[ h(t_0), \dots , h(t_{m-1}) \right]\).
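The generation of first-order kinetics can be sketched as follows, assuming that the concentrations solve \(\mathrm {d}h/\mathrm {d}t = K^T h\) with \(h(0)=h_0\). The three-species rate matrix below is a hypothetical chain A → B → C and not the five-species matrix K used in this section. The sketch evaluates the matrix exponential through an eigendecomposition, which additionally assumes that \(K^T\) is diagonalizable.

```python
import numpy as np

# Hypothetical rate matrix for a chain A -> B -> C (not the K of this section).
# Row i holds the outgoing rates of species i; each row sums to zero.
K = np.array([[-0.5, 0.5, 0.0],
              [0.0, -0.2, 0.2],
              [0.0, 0.0, 0.0]])
h0 = np.array([1.0, 0.0, 0.0])     # species A is the only educt

def kinetics(K, h0, times):
    """Columns h(t) = exp(t K^T) h0, evaluated via the eigendecomposition of K^T."""
    lam, V = np.linalg.eig(K.T)
    c = np.linalg.solve(V, h0)                   # coordinates of h0 in the eigenbasis
    H = (V * c) @ np.exp(np.outer(lam, times))   # sum_k c_k exp(lam_k t) v_k
    return H.real

times = np.linspace(0.0, 20.0, 50)               # equidistant time steps
H = kinetics(K, h0, times)                       # r x m matrix of kinetics
```

Since every row of K sums to zero, each column of H sums to 1, so H is column stochastic as required of the kinetics matrix.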
The single component spectra are built up as arbitrary sums of Lorentzians, which we illustrate in Fig. 2 (left). The five columns of matrix W accordingly contain the discretized intensity-by-wavenumber signals.
The spectral overlap among the single component spectra is adjustable. This means we may increase the level of spectral interference by moving all base points \(x_0\) of the generated Lorentzians towards certain focal points. The level of spectral interference determines the level of separability of the measurement data. While the results in [5] are based on near-separability because of low spectral interference, we demonstrate the effectiveness of our method even in the case of high interference among the component spectra.
The resulting measurement data matrix M is obtained as the product of the matrix W of component spectra and the matrix H of the underlying reaction kinetics as \(M=WH\). See Fig. 3 (top) for an interpolated visualization of M.
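A component spectrum of this kind can be sketched as a sum of Lorentzian lines. The parametrization by peak position, half width and peak height, as well as the wavenumber grid and the peak values, are assumptions chosen for illustration.

```python
import numpy as np

def lorentzian(x, x0, gamma, height=1.0):
    """Lorentzian line with base point x0, half width gamma and given peak height."""
    return height * gamma**2 / ((x - x0) ** 2 + gamma**2)

def component_spectrum(x, peaks):
    """Sum of Lorentzians; peaks is a list of (x0, gamma, height) tuples."""
    return sum(lorentzian(x, x0, g, a) for x0, g, a in peaks)

x = np.linspace(0.0, 2000.0, 2001)   # hypothetical wavenumber grid
w = component_spectrum(x, [(500.0, 10.0, 1.0), (1200.0, 15.0, 0.6)])
```

Stacking several such vectors as columns gives a matrix W, and moving the base points `x0` of different species closer together increases the spectral overlap.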
Recovery results
Considering the measurement data, according to the artificial reaction scheme as introduced in the previous Sect. 4.1, our goal is now to recover the single component spectra as well as the reaction kinetics only given matrix M. In other words, we compute matrices \(W_{rec}\) and \(H_{rec}\) by applying our novel NMF approach to M. We thereby are especially interested in the reconstruction of the true component spectra W in order to provide a powerful tool for compound identification in real-life Raman spectral analysis. Recall that the objective function in our approach is based on adding up the penalty terms in (8), which represent the structural properties of the sought-for matrices and which are weighted by choice of the coefficients \(\alpha , \beta\) and \(\gamma\). In this section we present the results of our method for the predefinitions
Recall additionally that we applied singular value decomposition in the preprocessing of our computational method. That is why the order of species in the recovered matrices \(W_{rec}\) and \(H_{rec}\) may be permuted in comparison to the order in the exact matrices W and H. For comparative visualization of our recovery results, we thus compute the correlation coefficients between the columns (\(\sim\) species) of \(W_{rec}\) and W and associate the spectra as well as the reaction kinetics according to the maximal correlation values.
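The association of recovered and exact species by correlation can be sketched as follows. The function name `match_species` is an assumption, and the simple row-wise maximum used here could in principle assign two exact species to the same recovered column if their spectra are very similar.

```python
import numpy as np

def match_species(W_rec, W):
    """For each column of W, find the best-correlated column of W_rec."""
    r = W.shape[1]
    # Rows of the stacked input are variables; keep only the cross-correlation block.
    corr = np.corrcoef(W.T, W_rec.T)[:r, r:]   # corr[i, j]: W[:, i] vs W_rec[:, j]
    return corr.argmax(axis=1), corr

# Hypothetical check: W_rec is a column permutation of W.
rng = np.random.default_rng(0)
W = rng.random((50, 3))
perm = [2, 0, 1]
W_rec = W[:, perm]                 # W_rec[:, j] equals W[:, perm[j]]
assignment, corr = match_species(W_rec, W)
```

The same assignment is then applied to the rows of the recovered kinetics so that spectra and concentration curves of one species are plotted together.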
Exemplary recovery results of our novel method for the noiseless case with low spectral interference are displayed in Fig. 4. Especially the recovery of components A, B and D is nearly exact: the positions as well as the heights of the peaks can hardly be distinguished visually from the original data. In the bottom right panel we also present the recovery result for the matrix H of reaction kinetics.
As in all upcoming illustrations of the reconstructed kinetics, the dotted lines are assigned to their species through the corresponding color in the spectral panels. For comparison, the exact kinetics (black lines) represent the kinetics from Fig. 2 (right). Indeed our reconstructed kinetics in Fig. 4 reflect the general trends of the exact kinetics as in particular species A is recognized to be the only educt, and species D to be the exclusive product of the generated reaction scheme.
As the first extension of the data setting, we now investigate the effectiveness of our method in the case of increased spectral interference. As mentioned in Sect. 4.1, we generate increased spectral interference among the component spectra in W by moving the base points \(x_0\) in all species towards three focal points. We then obtain component spectra as displayed in Fig. 5.
In Fig. 6 we present the results of our novel approach applied to very interference-rich measurement data. Besides the remaining high quality in the recovery of components A, B and D, the reconstruction of species C and E apparently improved compared to the results in Fig. 4. In this interference-rich case our method computes the positions of the peaks in all component spectra quite satisfactorily. Concerning the recovery of the reaction kinetics, displayed in the bottom right panel, we again precisely identify the educt and the product of the reaction.
As the second extension of our data setting we consider the recovery results of our routine under additional contamination by measurement noise. In any practical setting Raman spectral analysis needs to deal with this issue since, for instance, signal shot noise or background noise appear in any real experimental data. Here we assume the noise from all different sources to be adequately represented by additive Gaussian white noise, which disturbs the measurement matrix M according to
The entries of N are drawn from the normal distribution \({\mathcal {N}} (0,1)\) and \(\delta =0.5\) is the relative noise level. See Fig. 3 (bottom) for an interpolated visualization of the interference-rich and noisy measurement matrix \({\tilde{M}}\). Applying our novel NMF approach with the predefinitions in (9) to \({\tilde{M}}\), the results illustrated in Fig. 7 show that the component spectra still agree reasonably well with the exact spectra. Furthermore, the main traits of the true reaction kinetics are recognizable in the recovered kinetics as well.
Example: paracetamol in ethanol
We took experimental time-resolved Raman spectroscopy data of paracetamol as an example to demonstrate the application and usability of our NMF algorithm. Paracetamol is polymorphic, i.e., it crystallizes in different forms. These forms have different properties when the drug is processed into its final tablet formulation. The bioavailability of the drug can also differ between forms [25]. Control over crystallization is thus required in order to manufacture suitable tablets. It is important to study crystallization empirically under different conditions such as solvent and cooling rate. One important aspect is the choice of solvent: different solvents yield different polymorphs of paracetamol [26]. Crystallization studies from liquid solutions were performed in a custom-made acoustic levitator [27], in which the droplet of the solution is held in a stable and undisturbed position by means of an ultrasonic field. The acoustic levitator allows contact-free crystallization studies and in situ measurements. The environment around the sample can be controlled with respect to surface, temperature, and humidity by passing a cool or hot stream of nitrogen. During the experiment the solvent evaporates, which leads to a gradual increase of the concentration in the droplet until it finally crystallizes (Fig. 8). Time-resolved Raman spectroscopy is performed with a time resolution of 3 s during this crystallization process. Various pathways from the solution phase of the drug molecules to the final crystallized phase have been suggested. An intermediate metastable polyamorphic state has been reported, in which the paracetamol molecules exist in transient disorganised clusters that undergo ordering to reach the final, highly ordered crystal structure [28]. With our method, we were able not only to determine the kinetics of the intermediate phase, but also to calculate the spectrum of the intermediate state.
These data are crucial for understanding and thus controlling the crystallization of a drug substance. The measurements are shown in Fig. 9.
The following settings are used for the objective function: \(\alpha =0.00001, \beta =100, \gamma =100, \delta =1, \mu =1\). These settings emphasize feasible concentrations: we focus on providing a matrix \(H_{rec}\) with nonnegative entries and column sum 1, so that Fig. 11 shows mathematically feasible concentration curves. \(\alpha\) is set to a very low value because the intensities of the spectra are orders of magnitude higher than the entries in \(H_{rec}\) or \(P_{rec}\). After applying the optimization approach of Algorithm 1, the matrices \(H_{rec}\) and \(W_{rec}\) in particular are important experimental findings. They show the spectra of the intermediate steps and of the final crystal form of paracetamol (Fig. 10), and they show the kinetics of the crystallization process (Fig. 11). The matrix \(P_{rec}\) is:
This matrix represents the approximated Galerkin projection (3 states) of a transition process in a continuous space (the microscopic 3D arrangement of the atoms in the droplet). The third row of \(P_{rec}\) represents the initial state. The second row is the intermediate state; there is zero probability of going back from this state to the initial state. The first row represents the stable final crystal. The upper right part of \(P_{rec}\) is zero because the crystallization process is directed. Figure 11 shows a decay of the initial state that is nearly linear, whereas in reaction kinetics we usually expect exponential decay. The matrix is simply the optimal fit to a presumed kinetics according to the chosen objective function. Depending on the optimization criterion, one can obtain different results from NMF of the given raw Raman spectroscopy data.
These results can be checked with a cross-validation method to confirm the mathematical interpretation of the chemical process. We compared the results of NMF with simultaneous time-lapse photography of the droplet, the first of its kind to serve as a watchdog for checking whether the NMF results correspond to the experimental observations. Besides the agreement between the time step of the phase change observed in the concentration curves and the experimental time step, the results are further validated by the fact that the peaks reported for the metastable intermediate amorphous state closely match our calculated spectra. To name a few of many, the peaks of the measured intermediate state (red curve) at 1236 cm\(^{-1}\), 1326 cm\(^{-1}\), and 1618 cm\(^{-1}\) match the calculated peaks at 1235 cm\(^{-1}\), 1327 cm\(^{-1}\), and 1619 cm\(^{-1}\) [28]. Naturally, the peaks of the final moieties can also be verified and are in accordance with reported experimental data. Structural changes predicted with NMF are verified on the basis of this recording.
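The structural properties discussed above (a directed three-state transition matrix with a zero upper right part, and concentration curves with nonnegative entries and column sum 1) can be illustrated with a small sketch. The numerical values of `P` below are hypothetical and are not the recovered \(P_{rec}\); the propagation rule, in which each concentration column is obtained from the previous one by multiplication with the row-stochastic matrix, is likewise an assumption made for illustration. The helper names `propagate` and `is_feasible` are introduced here only for this example.

```python
import numpy as np

# HYPOTHETICAL transition matrix (not the paper's P_rec).
# State order as in the text: row 0 = final crystal,
# row 1 = intermediate state, row 2 = initial state.
P = np.array([
    [1.00, 0.00, 0.00],   # the final crystal is absorbing
    [0.10, 0.90, 0.00],   # intermediate: no way back to the initial state
    [0.00, 0.05, 0.95],   # initial state
])

def propagate(P, x0, n_steps):
    """Return H (n_states x (n_steps + 1)) whose k-th column is x0 @ P^k."""
    H = np.empty((P.shape[0], n_steps + 1))
    x = np.asarray(x0, dtype=float)
    for k in range(n_steps + 1):
        H[:, k] = x
        x = x @ P          # assumed update rule for a row-stochastic P
    return H

def is_feasible(H, tol=1e-9):
    """Check nonnegative entries and column sums equal to 1."""
    return bool(np.all(H >= -tol) and np.allclose(H.sum(axis=0), 1.0, atol=tol))

H = propagate(P, [0.0, 0.0, 1.0], n_steps=100)  # start in the initial state
directed = np.allclose(np.triu(P, k=1), 0.0)    # upper right part is zero
```

Under this assumed Markov model the initial-state concentration in row 2 of `H` decays geometrically as \(0.95^k\), which is the discrete counterpart of the exponential decay usually expected in reaction kinetics, and the final-state concentration never decreases.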
Conclusion
In summary, our novel NMF approach returns robust results in the recovery of component spectra and reaction kinetics, even though the method relies mainly on general structural properties of the sought-for matrices. The recovery results even indicate that the quality of the recovered component spectra improves as the spectral overlap among the component spectra increases. Our approach can therefore be considered a complement to the method of Luce et al. [5], whose success depends especially on low spectral interference (near-separability of M).
References
 1.
J.R. Ferraro, K. Nakamoto, C.W. Brown, Introductory Raman Spectroscopy, 2nd edn. (Academic Press, Amsterdam, 2003)
 2.
Y.S. Li, J.S. Church, Raman spectroscopy in the analysis of food and pharmaceutical nanomaterials. J. Food Drug Anal. 22(1), 29–48 (2014)
 3.
A. Kudelski, Analytical applications of Raman spectroscopy. Talanta 76(1), 1–8 (2008)
 4.
S.K. Sahoo, S. Umapathy, A.W. Parker, Time-resolved resonance Raman spectroscopy: exploring reactive intermediates. Appl. Spectrosc. 65(10), 1087–1115 (2011)
 5.
R. Luce, P. Hildebrandt, U. Kuhlmann, J. Liesen, Using separable nonnegative matrix factorization techniques for the analysis of time-resolved Raman spectra. Appl. Spectrosc. 70(9), 1464–1475 (2016)
 6.
D. Guillamet, J. Vitrià, Nonnegative matrix factorization for face recognition, in Topics in Artificial Intelligence ed. by M. Teresa Escrig, F. Toledo, E. Golobardes, (Springer, Berlin, 2002), pp. 336–344
 7.
W. Xu, X. Liu, Y. Gong, Document clustering based on nonnegative matrix factorization, in Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, (ACM, 2003), pp. 267–273. https://doi.org/10.1145/860435.860485
 8.
K. Devarajan, Nonnegative matrix factorization: an analytical and interpretive tool in computational biology. PLoS Comput. Biol. 4(7), 1–12 (2008)
 9.
N. Gillis, R. Luce, Robust near-separable nonnegative matrix factorization using linear optimization. J. Mach. Learn. Res. 15, 1249–1280 (2014)
 10.
S.A. Vavasis, On the complexity of nonnegative matrix factorization. SIAM J. Optim. 20(3), 1364–1377 (2010)
 11.
R. Hansen, A numerical method for solving Fredholm integral equations of the first kind using singular values. SIAM J. Numer. Anal. 8, 616–622 (1971)
 12.
C. Lin, Projected gradient methods for nonnegative matrix factorization. Neural Comput. 19(10), 2756–2779 (2007)
 13.
M.W. Berry, M. Browne, A.N. Langville, V.P. Pauca, R.J. Plemmons, Algorithms and applications for approximate nonnegative matrix factorization. Comput. Stat. Data Anal. 52, 155–173 (2007)
 14.
D. Donoho, V. Stodden, When does nonnegative matrix factorization give a correct decomposition into parts?, in Advances in Neural Information Processing Systems, vol. 16, ed. by S. Thrun, L.K. Saul, B. Schölkopf (MIT Press, Amsterdam, 2004), pp. 1141–1148
 15.
S. Arora, R. Ge, R. Kannan, A. Moitra, Computing a nonnegative matrix factorization, provably, in Proceedings of the Forty-Fourth Annual ACM Symposium on Theory of Computing, Ser. STOC ’12, (Association for Computing Machinery, New York, NY, USA, 2012), pp. 145–162
 16.
M. Weber, Meshless Methods in Conformation Dynamics, Ph.D. dissertation, Freie Universität Berlin (2006)
 17.
P. Deuflhard, M. Weber, Robust Perron cluster analysis in conformation dynamics. Linear Algebra Appl. Spec. Issue Matrices Math. Biol. 398, 161–184 (2005)
 18.
M. Weber, T. Galliat, Characterization of transition states in conformational dynamics using fuzzy sets. Zuse Institut Berlin (ZIB), Technical Report 02–12 (2002)
 19.
C. Schütte, Conformational Dynamics: Modelling, Theory, Algorithm, and Application to Biomolecules, Habilitation Thesis, Freie Universität Berlin (1999)
 20.
M. Weber, S. Kube, Robust Perron Cluster Analysis for Various Applications in Computational Life Science, Zuse Institut Berlin (ZIB). Technical Report 06–01 (2005)
 21.
K. Fackeldey, M. Weber, GenPCCA: Markov State Models for Non-Equilibrium Steady States. Big data clustering: data preprocessing, variable selection, and dimension reduction. WIAS Report No. 29, pp. 70–80 (2017)
 22.
J.C. Lagarias, J.A. Reeds, M.H. Wright, P.E. Wright, Convergence properties of the Nelder–Mead simplex method in low dimensions. SIAM J. Optim. 9(1), 112–147 (1998)
 23.
M. Weber, Implications of PCCA+ in molecular simulation. Computation 6(1), 20 (2018)
 24.
S. Gerber, L. Pospisil, M. Navandar, I. Horenko, Low-cost scalable discretization, prediction, and feature selection for complex systems. Sci. Adv. 6(5), eaaw0961 (2020)
 25.
J. Bauer, S. Spanton, R. Henry, J. Quick, W. Dziki, W. Porter, J. Morris, Ritonavir: an extraordinary example of conformational polymorphism. Pharm. Res. 18(6), 859–866 (2001)
 26.
R. Hilfiker, Polymorphism in the Pharmaceutical Industry (Wiley-VCH, Weinheim, 2006)
 27.
M.C. Schlegel, K.J. Wenzel, A. Sarfraz, U. Panne, F. Emmerling, A wall-free climate unit for acoustic levitators. Rev. Sci. Instrum. 83(5), 2013–2016 (2012)
 28.
Y. Nguyen Thi, K. Rademann, F. Emmerling, Direct evidence of polyamorphism in paracetamol. CrystEngComm 17(47), 9029–9036 (2015)
Acknowledgements
S.C. acknowledges funding from the graduate school of excellence SALSA. M.W. acknowledges funding from the CRC 1114 Scaling Cascades in Complex Systems in project A05 “Probing scales in equilibrated systems by optimal nonequilibrium forcing”. K.F. would like to thank MATH+.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Fackeldey, K., Röhm, J., Niknejad, A. et al. Analyzing Raman spectral data without separabiliy assumption. J Math Chem (2021). https://doi.org/10.1007/s10910-020-01201-7
Keywords
 Nonnegative matrix factorization
 NMF
 Raman spectra
 Separability condition
 PCCA+