1 Introduction

Computational power provides strong support for the life sciences today. Many experiments can be performed in silico: they are cheaper to conduct, and their parameters can be easily modified. In most cases they are also reproducible and ethical (no living creatures are harmed).

According to [1], the term modeling is defined as “to design or imitate forms: make a pattern” or “producing a representation or simulation”, and model is defined as “a system of postulates, data, and inferences presented as a mathematical description of an entity or state of affairs”. For in-silico experiments the more precise terms would be “computational model” and “computational modeling”, but in this paper we use the shorter forms. The term “modeling” is sometimes also used in the sense of “running a computational model”; in this paper we call that “simulation”. Creating a model is important in many disciplines: it makes it possible to rescale an object (enlarge or reduce it), to slow down or speed up the modeled process, and to examine almost any aspect of an object or process (by isolating parameters or by taking a chosen number of parameters into account). With computers, it is also possible to produce visualizations and animations.

This paper describes some aspects of modeling a process that occurs in all organisms or, more precisely, in almost every living cell. It is occurring right now, in my body while I am writing this text and in yours while you are reading it. This is the process of transferring genetic material (DNA) during cell division. The process is difficult to examine: living cells can be observed only for a relatively short time. Another difficulty is its microscale: observing it requires microscopes. And, scale aside, examining the interior of a cell means destroying (and killing) it. There are attempts to create an “artificial” [2] (or “synthetic” [3]) cell, but this is not an easy task. To cope with this, following a “divide and conquer” strategy, there are attempts to create models of individual cell components and processes. This paper presents what we learned while trying to model chromosome territories (CT’s), which are the final result of modeling and simulating the chromatin decondensation (CD) process, and documents some of the problems we encountered (and how we solved them) on the way to a working model.

1.1 Motivation

Some time ago we were asked whether we could help create a probabilistic model of CT’s (in short, CT’s are the distinct 3D spaces occupied by each chromosome after cell division; see also Sect. 1.2). We agreed, and what we expected to be a project of a few months turned out to be a mine of many different problems to solve.

The first problem we focused on was creating an appropriate model of chromatin and of the chromatin decondensation process (so that the process could be implemented and simulated) in the phase immediately after cell division.

1.2 Background

In eukaryotic cells, genetic material is not stored in the well-known form of a helix, because the DNA strand is too long (and too vulnerable to damage). It is stored as a complex of the DNA strand and proteins, together called chromatin, which is folded in a very sophisticated way [5]. This takes up much less space and keeps the DNA untangled. It probably also helps prevent random breaks and changes in DNA sequences. Research on chromatin organization is important because of its influence on gene transcription [6].

There are several levels of chromatin organization, depending on the level of packing ([4, 7]). The two extreme levels of packing are the condensed and the decondensed one [11]. The level we are interested in, somewhere between the extremes, is called euchromatin. This level of organization is often referred to as “beads on a strand” (see Fig. 1).

Fig. 1. Euchromatin: beads on a strand

The level of chromatin condensation depends on several factors. One is the cell state (during the cell cycle), but it is also known that condensation can be controlled by epigenetic modifications [8] or biological processes [10]. The risk of DNA damage [9] or modification varies with the chromatin condensation level.

During cell division, chromatin fibers condense into structures called chromosomes. In the period between two subsequent divisions, called interphase, chromosomes decondense and occupy 3D regions within the nucleus. These distinct regions, called “chromosome territories” (CT’s), are regarded as a major feature of nuclear structure ([12, 22]). Chromosome territories can be visualized directly using in-situ hybridization with fluorescently labeled DNA probes that specifically paint individual chromosomes ([18, 20]). Research on CT’s studies the relationship between the internal architecture of the cell nucleus and crucial intranuclear processes such as regulation of gene expression, gene transcription or DNA repair ([17, 19, 21]). These studies concern spatial arrangement, dynamics (motion tracking) [13], frequency of genomic translocations [14] and even global regulation of the genome [15]. The possibility of performing experiments in silico would make some of them faster, easier and cheaper.

2 Euchromatin Model and Chromatin Decondensation Process Modeling

Euchromatin was our starting point for modeling chromatin structure: we decided to model chromatin (and the arms of chromosomes) as a sequence of tangent spheres (Fig. 2), visually very similar to euchromatin (see Fig. 1). Because euchromatin is observed as “beads on a strand” and beads (sometimes also called “domains”) are its basic structural units, we made a single sphere the basic component of the chromatin chain (and the basic structural unit building up CT’s). This also makes our model scalable: by changing the size of the sphere we can easily change the level of chromatin packing. A sphere can also be rendered easily as a graphical primitive in most graphics libraries, which was very important for guaranteeing that CT’s could later be visualized. Our modeling process was closely tied to geometrical, visible objects, because it was very important that the final models could be visualized to allow visual comparison with real images from confocal microscopy.

Fig. 2. Euchromatin model as tangent spheres

We also decided to model the decondensation process by adding tangent spheres around the existing ones. This results in a gradually expanding volume around the initial strand of spheres. The process continues until the stop condition is met (volume or size of the decondensed chromatin).

The computational problem was as follows: starting from the initial (condensed) chromatin model (in “beads-on-a-strand” form), consisting of a sequence of mutually tangent spheres, find the coordinates of the next N spheres (where N denotes the size, that is, the number of beads, of the chromatin after decondensation). Geometrically, this is the problem of finding the point (x, y, z) that is the center of a new sphere, under the condition that the new sphere is tangent to the previous one and does not collide with any other previously generated sphere.
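The geometric condition can be written compactly. The following is only a sketch: it assumes that all spheres share one radius \(r\), and the symbols \(c_{new}\), \(c_i\) and \(c_j\) (centers of the new sphere, of the sphere it is attached to, and of any other existing sphere) are notation introduced here for illustration.

$$\begin{aligned} \Vert c_{new}-c_i\Vert = 2r, \qquad \Vert c_{new}-c_j\Vert \ge 2r \quad \text{for every existing sphere } j \ne i. \end{aligned}$$

The first relation expresses tangency to the chosen sphere; the second excludes overlap with all the others.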

Our first goal was a fully probabilistic model, meaning that we do not add further conditions such as the positions of centromeres and telomeres, nucleoplasm stickiness and so on (extending the model and making it more “real data-driven” is part of our current research). The modeled decondensation process can be regarded as a Markov process: the subsequent state \(i+1\) of decondensation depends strictly on the previous state \(i\).

The basic component of our model is a sphere S((x, y, z), r). This notation should be read as a sphere S with its center at the point with coordinates (x, y, z) and a radius of length \(r, (r\ge 0)\). An ordered chain of spheres forms our model of a chromosome; a set of indexed spheres forms a model of a CT.
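One possible way to represent these objects in code is sketched below. It is a minimal illustration, not the implementation used in this work: the names `Sphere`, `volume`, `chromosome` and `territory` are assumptions, and the coordinates are placeholder values.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Sphere:
    """Sphere S((x, y, z), r): center coordinates and radius r >= 0."""
    x: float
    y: float
    z: float
    r: float

    def __post_init__(self):
        if self.r < 0:
            raise ValueError("radius must be non-negative")

    def volume(self) -> float:
        """Volume of the sphere, usable in a volume-based stop condition."""
        return 4.0 / 3.0 * math.pi * self.r ** 3


# A chromosome is modeled as an ordered chain (list) of spheres.
chromosome = [Sphere(0.0, 0.0, 0.0, 1.0), Sphere(2.0, 0.0, 0.0, 1.0)]

# A CT is modeled as a set of indexed spheres, here a dict: index -> sphere.
territory = dict(enumerate(chromosome))
```

A list keeps the order of the chain, while the dictionary gives direct access to a sphere by its index, which is convenient when an existing sphere has to be drawn at random.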

The general algorithm for CT modeling is presented in Algorithm 1. Lines 4 and 5 create the initial chromatin strand, and line 6 simulates decondensation. Together, they generate the model of a given CT.

[Algorithm 1: general algorithm for CT modeling]

The last step of the algorithm (line 6) proved to be the most demanding and challenging, as described in the next section.

3 Experiments and Results

In this section, we document the path we took to arrive at a working probabilistic model of CT’s.

3.1 Modeling Chromatin Decondensation with CC

At first, we used Cartesian coordinates (CC). The algorithm first generates coordinates for the sphere denoted as the centromere and then adds subsequent spheres to it until it reaches the length of the arms (given a priori) for the given chromosome. Once the model of the entire chromosome exists, the algorithm draws the id of one of the existing spheres \(S_i((x_i, y_i, z_i),r)\) (from those composing the chromosome) and then draws “candidate coordinates” \(x_{i+1}\), \(y_{i+1}\) and \(z_{i+1}\) for the center of the next sphere. The new coordinates are drawn from a limited range, not too far from the current sphere’s center (as the spheres should be tangent).

Fig. 3. Determining the location of the \(S_{i+1}\) sphere using CC

To allow a little flexibility, a tolerance \(\varepsilon \) on the drawn coordinates was introduced. Once the coordinates were drawn, the distance \(dist(S_i,S_{i+1})\) was calculated to check whether the new sphere could be added. The distance was computed as the ordinary Euclidean distance.

$$\begin{aligned} dist\left( S_i,S_{i+1}\right) = \sqrt{(x_{i+1}-x_i)^2+(y_{i+1}-y_i)^2+(z_{i+1}-z_i)^2} \end{aligned}$$
(1)

If \(dist(S_i,S_{i+1})\) was appropriate, the conditions for not colliding with existing elements were checked. If all conditions were met, the new sphere was added (for details see [26]).
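The two checks can be sketched as follows. This is an illustrative sketch only: it assumes that all spheres share one radius r (so tangent centers are 2r apart) and that the same tolerance ε is applied to the collision test; the function names `dist` and `can_add` are hypothetical.

```python
import math


def dist(c1, c2):
    """Ordinary Euclidean distance between two centers (x, y, z), as in Eq. (1)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))


def can_add(candidate, parent, others, r, eps):
    """Return True if a sphere centered at `candidate` may be added.

    Assumption: every sphere has radius r, so the candidate is tangent to
    the parent sphere when their centers are 2r apart (within eps).
    """
    # Tangency check: the distance to the parent must be about 2r.
    if abs(dist(candidate, parent) - 2 * r) > eps:
        return False
    # Collision check: no other existing sphere may overlap the candidate.
    return all(dist(candidate, c) >= 2 * r - eps for c in others)
```

For example, with r = 1 a candidate at (2, 0, 0) is accepted next to a parent at the origin, whereas a candidate at (3, 0, 0) is rejected as too far.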

There were no problems with generating the initial chromatin strand as a sequence of spheres (the chromosome). Problems emerged when we tried to simulate the decondensation of chromatin: generating a model took a lot of time, and we noticed that the simulation sometimes failed. Log analysis showed that the algorithm got stuck trying to find coordinates for \(S_{i+1}\). We therefore added a function that triggers a restart of the algorithm after 500 unsuccessful attempts to place the \(S_{i+1}\) sphere (see Algorithm 2, lines 12–14). If \(S_{i+1}\) cannot be placed, the algorithm starts over and tries to add \(S_{i+1}\) to another sphere of the chromosome.

The pseudocode for this version of the algorithm is shown in Algorithm 2. In the first step it generates the “candidate coordinates” for the center of \(S_{i+1}\) (Algorithm 2, lines 3–6). Because of \(\varepsilon \), the new sphere may be a little too far from, or too close to, the previous sphere \(S_i\). Fine-tuning is done by an additional function that checks the distance from the previous sphere (Algorithm 2, lines 7–8). The additional code for stuck detection, which triggers a restart of the computations, is in Algorithm 2, lines 12–14.

[Algorithm 2: simulation of CD using CC, with a restart after 500 unsuccessful attempts]
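The overall scheme of the CC variant can be illustrated with the sketch below. It is a simplified reconstruction under stated assumptions, not the original Algorithm 2: all spheres share one radius, the default values of `r` and `eps` are placeholders, a "restart" simply draws another existing sphere, and the names `try_place_cc` and `decondense_cc` are hypothetical.

```python
import math
import random

MAX_ATTEMPTS = 500  # restart threshold mentioned in the text


def dist(c1, c2):
    """Euclidean distance between two centers (x, y, z)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))


def try_place_cc(parent, centers, r, eps, rng):
    """Try to place one sphere tangent to `parent` using Cartesian draws.

    Returns the new center, or None after MAX_ATTEMPTS failed attempts.
    """
    reach = 2 * r + eps
    for _ in range(MAX_ATTEMPTS):
        # Draw each coordinate independently from a cube around the parent.
        cand = tuple(p + rng.uniform(-reach, reach) for p in parent)
        # Keep only candidates whose distance to the parent is about 2r.
        if abs(dist(cand, parent) - 2 * r) > eps:
            continue
        # Reject candidates that collide with any existing sphere.
        if all(dist(cand, c) >= 2 * r - eps for c in centers):
            return cand
    return None


def decondense_cc(centers, n_new, r=1.0, eps=0.05, seed=0):
    """Add n_new spheres; return the final centers and the restart count."""
    rng = random.Random(seed)
    centers = list(centers)
    restarts = 0
    while n_new > 0:
        parent = rng.choice(centers)  # draw the id of an existing sphere
        cand = try_place_cc(parent, centers, r, eps, rng)
        if cand is None:
            restarts += 1  # stuck: start over with another drawn sphere
            continue
        centers.append(cand)
        n_new -= 1
    return centers, restarts
```

The sketch makes the source of inefficiency visible: most points drawn from the cube do not lie at distance 2r from the parent, so many draws are discarded before the collision test is even reached.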

This made the simulation of the CD process long and inefficient, and the result was disappointing: the algorithm got stuck relatively often. The measured number of restarts needed to complete model creation is shown in Table 1.

Table 1. The number of restarts during simulating CD using CC

Creating one model requires placing about 650 tangent spheres, so it was easy to assess the number of inefficient searches; they are presented in the last column of Table 1.

Table 2 shows the time needed to generate one CT model. Time was measured in seconds over 40 generated models, and basic statistics are also given.

Table 2. Time of modeling CT’s with CC used in the simulation of CD [in seconds, measured on 40 models]

That was not a satisfactory result, and we had to rethink how we implemented chromatin decondensation. We decided to try something that at first sight adds computation: shifting (relocating) the origin of the coordinate system. This let us use the notion of a neighborhood with a fixed radius (inspired by topology) and spherical coordinates (see Fig. 4).

We were aware that shifting the coordinate system takes additional time, but the CC solution worked so badly that we hoped this approach would work at least a little better. The result exceeded our expectations, as described and documented in the next sections.

3.2 Modeling Chromatin Decondensation Process Using SC

We decided to try spherical coordinates (SC) [27] instead of CC (readers who are not familiar with different coordinate systems may wish to consult [28, 29]). To add a sphere \(S_{i+1}\) to a given sphere \(S_i\), we first shifted the origin of the coordinate system to the center of the \(S_i\) sphere (see Fig. 4).

Fig. 4. Determining the location of the \(S_{i+1}\) sphere using SC

This lets us search for \(S_{i+1}\) by drawing two angles and using just one fixed parameter: the distance 2r.

[Algorithm: determining the location of \(S_{i+1}\) using SC]
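The idea can be sketched in a few lines. The code below is an illustration under assumptions rather than the original algorithm: all spheres share one radius, both angles are drawn uniformly from their ranges (which is not uniform over the sphere surface), and the function name `place_sc` and the safety bound `max_tries` are hypothetical.

```python
import math
import random


def place_sc(parent, centers, r, rng, max_tries=1000):
    """Place a sphere tangent to `parent` by drawing two angles.

    The origin is shifted to the parent's center, so every candidate lies
    at distance 2r from it by construction and needs no tangency check.
    """
    px, py, pz = parent
    d = 2 * r  # the single fixed parameter: distance between tangent centers
    for _ in range(max_tries):
        theta = rng.uniform(0.0, math.pi)      # polar angle
        phi = rng.uniform(0.0, 2.0 * math.pi)  # azimuthal angle
        # Spherical -> Cartesian in the shifted system, then shift back.
        cand = (
            px + d * math.sin(theta) * math.cos(phi),
            py + d * math.sin(theta) * math.sin(phi),
            pz + d * math.cos(theta),
        )
        # Only the collision test remains (small tolerance for rounding).
        if all(math.dist(cand, c) >= d - 1e-9 for c in centers):
            return cand
    return None
```

Compared with drawing three Cartesian coordinates, no draw is wasted on points at the wrong distance from \(S_i\); a candidate can fail only because it collides with another sphere.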

After switching to SC, the simulation no longer looped while trying to find a location for \(S_{i+1}\). The function that restarts CT model creation could therefore be removed.

We then measured the time needed to generate CT models (equivalent to the time of CD simulation) with the shifted coordinate system and spherical coordinates; the results are presented in Table 3.

Table 3. Time of simulating CD process with the use of SC [in seconds, measured on 40 models]

The time of CD simulation decreased significantly compared with CC, which had a direct and significant impact on the time of model creation.

3.3 Comparison of Computational Time of CT Modeling with CC and SC

To follow the rigor expected of scientific publications (despite the very clear difference between the times shown in Table 2 and Table 3), we performed the analysis presented in this section. For a visual comparison of CT model creation times we prepared a boxplot (see Fig. 5) as a general view.

In Fig. 5 the difference is easy to notice. No elements of the two boxplots overlap, neither the whiskers nor the dots (outliers).

Fig. 5. Time [seconds] of simulating chromatin decondensation: consolidated comparison of CC and SC [from the generation of 40 models]

It is easy to notice a huge difference in computing time (and its stability) between the two cases.

For the record, we performed a statistical test; the result is presented in Table 4. We calculated a two-sample t-test to confirm that the difference in model creation times (CC versus SC) is statistically significant (a p-value below 0.05 means that the difference is statistically significant).
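For readers who want to repeat such a comparison on their own timings, a two-sample t statistic can be computed as sketched below. The sketch assumes Welch's variant (unequal variances), which may differ from the variant used for Table 4; the function name `welch_t` is hypothetical, and the sample values are placeholders, not the measurements reported in this paper.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's two-sample t-test on samples a and b."""
    # Squared standard errors of the two sample means.
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Placeholder values only; NOT the timings from Tables 2 and 3.
sample_a = [1.0, 2.0, 3.0, 4.0, 5.0]
sample_b = [2.0, 3.0, 4.0, 5.0, 6.0]
t, df = welch_t(sample_a, sample_b)
```

The p-value is then read from the t distribution with `df` degrees of freedom, for instance with a statistics package.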

Table 4. Statistics for two sample t-test (modeling time with CC and with SC)

This confirms that the difference in modeling time between the two described methods is statistically significant.

4 Conclusions and Future Works

Based on the results presented in this paper, we conclude that when modeling in 3D space, using spherical coordinates may lead to a more efficient implementation of an algorithm, even when the origin of the coordinate system has to be shifted. The implementation based on Euclidean distance in the Cartesian coordinate system was much more time-consuming. More importantly, it often did not finish the modeling process in an acceptable time (sometimes we had to stop the simulation after 3 weeks of computing on a computer with 16 GB of RAM and an i5 processor), if it finished at all (without getting stuck).

As future work, knowing that a spherical coordinate system is helpful, we want to examine the effectiveness of a quaternion-based implementation as a way to represent coordinates in 3D space. We also want to check in more detail what has an impact: only changing the origin of the coordinate system, only changing the point representation, or both.

Because this is not the first time we have noticed a significant improvement after using spherical (or, in more dimensions, hyperspherical) coordinates instead of Cartesian ones, we plan (after finishing current projects with deadlines) to design and conduct a separate experiment. We want to investigate, in a more methodical and ordered way, the question: why do spherical coordinates give better results in computational implementations?

Our case study also shows that geometrical and visual thinking can be helpful when modeling in 3D space. With “purely algebraic” thinking (based on calculations on coordinates), finding the idea of searching in a neighborhood, shifting the origin of the coordinate system, and then using a direction (angles) and a fixed distance would have been more difficult (if possible at all).