
1 Introduction

Interpolation and approximation techniques are used in the solution of many engineering problems. However, the interpolation of unorganized scattered data remains a difficult problem. In the one-dimensional case, i.e. for curves represented as \( y = f\left( x \right) \), points can be ordered according to the \( x \)-coordinate; in higher dimensions this is not possible. Therefore, the standard approaches are based on the tessellation of the domain in the \( x,y \) or \( x,y,z \) space using, e.g., Delaunay triangulation [7], etc. This approach is applicable to static data and to \( t \)-varying data, if the data in the time domain are "framed", i.e. given for specific time samples. It also leads to an increase of dimensionality, i.e. from triangulation in \( E^{2} \) to triangulation in \( E^{3} \), or from triangulation in \( E^{3} \) to triangulation in \( E^{4} \), etc. This significantly increases the complexity of the triangulation and of the implementation of a triangulation algorithm. It is a significant factor influencing the computation in the case of large data sets and large-range data sets, i.e. when the \( x,y,z \) values span several orders of magnitude.

On the contrary, meshless interpolations based on Radial Basis Functions (RBF) offer several significant advantages, namely:

  • RBF interpolation is applicable generally to \( d \)-dimensional problems and does not require tessellation of the definition domain

  • RBF interpolation and approximation is especially convenient for scattered data interpolation, including interpolation of scattered data in time as well

  • RBF interpolation is smooth by definition

  • RBF interpolation can be applied to scalar fields and vector fields alike, which can be used for scalar and vector field visualization

  • If the Compactly Supported RBFs (CSRBF) are used, sparse matrix data structures can be used which decreases memory requirements significantly.

However, there are some weak points of RBF application in real problems solution:

  • robustness and reliability of the RBF application to large data sets is a real problem due to the ill-conditioning of the matrix \( \varvec{A} \) of the system of linear equations, which is to be solved

  • numerical stability and number representation become problematic over a large span of \( x,y,z \) values, i.e. if the values span several orders of magnitude

  • problems with memory management, as the memory requirements are of \( O\left( {N^{2} } \right) \) complexity, where \( N \) is the number of points in which values are given

  • the computational complexity of the solution of the linear system, which is \( O\left( {N^{3} } \right) \), resp. \( O\left( {kN^{2} } \right) \), where \( k \) is the number of iterations if an iterative method is used; in general, \( k \) is relatively high

  • problems with unexpected behavior at geometrical borders

Many contributions solving some issues of RBF interpolation and approximation are available. Numerical tests are mostly made using standard testing functions on a restricted domain span, mostly the interval \( \langle 0,1 \rangle \) or similar. However, in many physically based applications, the span of the domain is higher, usually over several orders of magnitude, and large data sets need to be processed.

As meshless techniques are easily scalable to higher dimensions and can handle spatially scattered data as well as spatio-temporal data, they can be used in many engineering and economic computations, etc. Polygonal representations (tessellated domains) are used in computer graphics and visualization as a surface representation and for surface rendering. In time-varying objects, a surface is represented as a triangular mesh with constant connectivity.

On the other hand, all polygonal-based techniques require, in the case of scattered data, a tessellation, e.g. Delaunay triangulation with \( O\left( N^{\left\lfloor d/2 \right\rfloor + 1} \right) \) computational complexity for \( N \) points in \( d \)-dimensional space, or another tessellation method. However, the implementation complexity of tessellation algorithms grows significantly with dimensionality, and severe problems with robustness may be expected as well.

In the case of data visualization, smooth interpolation or approximation on unstructured meshes, e.g. triangular or tetrahedral meshes, is required when physical phenomena are associated with points. This is quite a difficult task, especially if smoothness of the interpolation is needed; however, this is a natural requirement in physically based problems.

2 Meshless Interpolation

Meshless (meshfree) methods are based on the idea of Radial Basis Function (RBF) interpolation [1, 2, 22, 23], which is not separable. RBF-based techniques are easily scalable to \( d \)-dimensional space, do not require a tessellation of the geometric domain, and offer smooth interpolation naturally. In general, meshless techniques lead to the solution of a system of linear equations (LS) [4, 5] with a full or sparse matrix.

Generally, meshless methods for scattered data can be split into two main groups in computer graphics and visualization:

  • "implicit" – \( F\left( \varvec{x} \right) = 0 \), i.e. \( F\left( {x,y,z} \right) = 0 \), used in the case of a surface representation in \( E^{3} \), e.g. surface reconstruction resulting in an implicit function representation. This approach originates from implicit function modeling [15],

  • "explicit" – \( F\left( \varvec{x} \right) = h \), used in interpolation or approximation resulting in a functional representation, e.g. a height map in \( E^{2} \), i.e. \( h = F\left( {x,y} \right) \),

where \( \varvec{x} \) is a point represented generally in \( d \)-dimensional space, e.g. \( \varvec{x} = \left[ {x,y} \right]^{T} \) in the 2-dimensional case, and \( h \) is a scalar or vector value.

The RBF interpolation is based on computing the distance of two points in \( d \)-dimensional space and is defined by the function:

$$ f\left( \varvec{x} \right) = \mathop \sum \limits_{j = 1}^{M} \lambda_{j} \varphi \left( {\left\| {\varvec{x} - \varvec{x}_{j} } \right\|} \right) = \mathop \sum \limits_{j = 1}^{M} \lambda_{j} \varphi \left( {r_{j} } \right) $$
(1)

where: \( r_{j} = \left\| {\varvec{x} - \varvec{x}_{j} } \right\|_{2} \overset{\text{def}}{=} \sqrt {\left( {x - x_{j} } \right)^{2} + \left( {y - y_{j} } \right)^{2} } \) (in the 2-dimensional case) and \( \lambda_{j} \) are weights to be computed. Due to stability issues, a polynomial \( P_{k} \left( \varvec{x} \right) \) of degree \( k \) is usually added [6]. It means that for the given data set \( \left\{ \langle {\varvec{x}_{i} ,h_{i} }\rangle \right\}_{1}^{M} \), where \( h_{i} \) are the associated values to be interpolated and \( \varvec{x}_{i} \) are the domain coordinates, we obtain a linear system of equations:

$$ h_{i} = f\left( {\varvec{x}_{i} } \right) = \mathop \sum \limits_{j = 1}^{M} \lambda_{j} \varphi \left( {\left\| {\varvec{x}_{i} - \varvec{x}_{j} } \right\|} \right) + P_{k} \left( {\varvec{x}_{i} } \right)\;\;\;\;\;i = 1, \ldots ,M\;\;\;\;\;\varvec{x} = \left[ {x,y,1} \right]^{T} $$
(2)

For a practical use, a polynomial of the 1st degree is used, i.e. linear polynomial \( P_{1} \left( \varvec{x} \right) = \varvec{a}^{T} \varvec{x} \) in many applications. Therefore, the interpolation function has the form:

$$ \begin{aligned} f\left( {\varvec{x}_{i} } \right) & = \mathop \sum \limits_{j = 1}^{M} \lambda_{j} \varphi \left( {\left\| {\varvec{x}_{i} - \varvec{x}_{j} } \right\|} \right) + \varvec{a}^{T} \varvec{x}_{i} \\ & = \mathop \sum \limits_{j = 1}^{M} \lambda_{j} \varphi_{i,j} + \varvec{a}^{T} \varvec{x}_{i} \;\;\;\;\;h_{i} = f\left( {\varvec{x}_{i} } \right)\;\;\;\;\;i = 1, \ldots ,M \\ \end{aligned} $$
(3)

and additional conditions are to be applied:

$$ \mathop \sum \limits_{j = 1}^{M} \lambda_{j} \varvec{x}_{j} = \varvec{0}\;\;\;\;\;{\text{i}}.{\text{e}}.\;\;\;\;\;\mathop \sum \limits_{j = 1}^{M} \lambda_{j} x_{j} = 0\;\;\;\mathop \sum \limits_{j = 1}^{M} \lambda_{j} y_{j} = 0\;\;\;\mathop \sum \limits_{j = 1}^{M} \lambda_{j} = 0 $$
(4)

It can be seen that for the \( d \)-dimensional case a system of \( \left( {M + d + 1} \right) \) linear equations has to be solved, where \( M \) is the number of points in the dataset and \( d \) is the dimensionality of the data. For \( d = 2 \), the vectors \( \varvec{x}_{i} \) and \( \varvec{a} \) have the form \( \varvec{x}_{i} = \left[ {x_{i} , y_{i} ,1} \right]^{T} \) and \( \varvec{a} = \left[ {a_{x} , a_{y} ,a_{0} } \right]^{T} \), and we can write:

$$ \left[ {\begin{array}{*{20}c} {\varphi_{1,1} } & {..} & {\varphi_{1,M} } & {x_{1} } & {y_{1} } & 1 \\ : & \ddots & : & : & : & : \\ {\varphi_{M,1} } & {..} & {\varphi_{M,M} } & {x_{M} } & {y_{M} } & 1 \\ {x_{1} } & {..} & {x_{M} } & 0 & 0 & 0 \\ {y_{1} } & {..} & {y_{M} } & 0 & 0 & 0 \\ 1 & {..} & 1 & 0 & 0 & 0 \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {\lambda_{1} } \\ : \\ {\lambda_{M} } \\ {a_{x} } \\ {a_{y} } \\ {a_{0} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {h_{1} } \\ : \\ {h_{M} } \\ 0 \\ 0 \\ 0 \\ \end{array} } \right] $$
(5)

This can be rewritten in the matrix form as:

$$ \left[ { \begin{array}{*{20}c} \varvec{B} & \varvec{P} \\ {\varvec{P}^{T} } & \user2{0} \\ \end{array} } \right]\left[ {\begin{array}{*{20}c}\varvec{\lambda}\\ \varvec{a} \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} \varvec{f} \\ \user2{0} \\ \end{array} } \right]\;\;\;\;\;\;\varvec{Ax} = \varvec{b}\;\;\;\varvec{a}^{T} \varvec{x}_{\varvec{i}} = a_{x} x_{i} + a_{y} y_{i} + a_{0} $$
(6)

For the two-dimensional case and \( M \) given points, a system of \( \left( {M + 3} \right) \) linear equations has to be solved. If "global" functions, e.g. \( \varphi \left( r \right) = r^{2} \log r \), are used, then the matrix \( \varvec{B} \) is full; if "local" functions (CSRBFs) are used, the matrix \( \varvec{B} \) can be sparse.
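As an illustration, the augmented system (5)-(6) can be assembled and solved directly. The following is a minimal NumPy sketch, not the authors' implementation, assuming the TPS basis \( \varphi \left( r \right) = r^{2} \log r \) in the 2D case:

```python
import numpy as np

def tps(r):
    # Thin-Plate Spline phi(r) = r^2 log r, with the convention phi(0) = 0
    with np.errstate(divide="ignore", invalid="ignore"):
        v = r * r * np.log(r)
    return np.nan_to_num(v)

def rbf_interpolate_2d(pts, h):
    """Solve the (M+3)x(M+3) system (5) for the RBF weights lambda and
    the linear polynomial coefficients a = [a_x, a_y, a_0]."""
    M = len(pts)
    r = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    B = tps(r)                                # M x M RBF submatrix
    P = np.hstack([pts, np.ones((M, 1))])     # M x 3 polynomial submatrix
    A = np.block([[B, P], [P.T, np.zeros((3, 3))]])
    b = np.concatenate([h, np.zeros(3)])
    w = np.linalg.solve(A, b)
    return w[:M], w[M:]                       # lambda, a

def rbf_eval(x, pts, lam, a):
    # Evaluate f(x) = sum_j lambda_j phi(||x - x_j||) + a_x x + a_y y + a_0
    r = np.linalg.norm(pts - x, axis=-1)
    return tps(r) @ lam + a[0] * x[0] + a[1] * x[1] + a[2]
```

By construction, the interpolant reproduces the given values \( h_{i} \) at the given points exactly.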

The RBF interpolation was originally introduced by Hardy as the multiquadric method in 1971 [5], which was later called the Radial Basis Function (RBF) method. Since then, many different RBF interpolation schemes with specific properties have been developed; e.g. [4] uses \( \varphi \left( r \right) = r^{2} \log r \), which is called the Thin-Plate Spline (TPS), and the function \( \varphi \left( r \right) = e^{{ - \left( {\epsilon r} \right)^{2} }} \) was proposed in [23]. However, the shape parameter \( \epsilon \) may lead to an ill-conditioned system of linear equations [26].

The CSRBFs were introduced as:

$$ \varphi \left( r \right) = \left\{ {\begin{array}{*{20}l} {\left( {1 - r} \right)^{q} P\left( r \right),} & { 0 \le r \le 1} \\ {0,} & { r > 1} \\ \end{array} } \right. $$
(7)

where: \( P\left( r \right) \) is a polynomial function and \( q \) is a parameter. Theoretical problems with numerical stability were solved in [4]. In the case of global functions, the system of linear equations becomes ill-conditioned and problems with convergence can be expected. On the other hand, if CSRBFs are used, the matrix \( \varvec{A} \) becomes relatively sparse, i.e. the solution of the linear system will be faster, but the scaling factor \( \alpha \) needs to be selected carefully (which can be "tricky") and the final function may tend to have a "blobby" shape, see Table 1 and Fig. 1.

Table 1. Typical examples of "local" functions – CSRBF ("\( + \)" means the value is zero outside \( \langle 0,1 \rangle \))
Table 2. Examples of testing functions
Fig. 1. Properties of CSRBFs

The compactly supported RBFs are defined for the "normalized" interval \( r \in \langle 0,1 \rangle \), but for practical use a scaling is applied, i.e. the value \( r \) is multiplied by a shape parameter \( \alpha \), where \( \alpha > 0 \).
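A minimal sketch of how a scaled CSRBF leads to a sparse matrix follows. Since Table 1 is not reproduced here, Wendland's C2 function \( \varphi \left( r \right) = \left( {1 - r} \right)^{4} \left( {4r + 1} \right) \) is assumed as one common CSRBF choice, and SciPy's k-d tree restricts the work to point pairs within the support radius \( 1/\alpha \):

```python
import numpy as np
from scipy import sparse
from scipy.spatial import cKDTree

def wendland_c2(r):
    # Wendland's C2 CSRBF: phi(r) = (1 - r)^4 (4r + 1) on <0,1>, zero for r > 1
    return np.where(r < 1.0, (1.0 - r) ** 4 * (4.0 * r + 1.0), 0.0)

def csrbf_matrix(pts, alpha):
    """Sparse matrix B[i,j] = phi(alpha * ||x_i - x_j||).
    Entries with alpha*r >= 1 vanish, so only the point pairs within the
    support radius 1/alpha are visited."""
    M = len(pts)
    rows, cols = list(range(M)), list(range(M))      # diagonal: phi(0) = 1
    for i, j in cKDTree(pts).query_pairs(1.0 / alpha):
        rows += [i, j]                               # symmetric off-diagonal pair
        cols += [j, i]
    r = np.linalg.norm(pts[np.array(rows)] - pts[np.array(cols)], axis=-1)
    return sparse.coo_matrix((wendland_c2(alpha * r), (rows, cols)),
                             shape=(M, M)).tocsr()
```

The larger \( \alpha \) is, the smaller the support radius and the sparser the matrix, which is exactly the trade-off against the "blobby" shape mentioned above.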

Meshless techniques are primarily based on the approaches mentioned above. They are nowadays used in the solution of engineering problems, e.g. partial differential equations, surface modeling, surface reconstruction of scanned objects [13, 14], reconstruction of corrupted images [21], etc. More generally, meshless object representation is based on specific interpolation or approximation techniques [1, 6, 23].

The resulting matrix \( \varvec{A} \) tends to be large and ill-conditioned. Therefore, specific numerical methods have to be used to increase the robustness of the solution, such as preconditioning methods or parallel computing on GPUs [9, 10], etc. In addition, subdivision or hierarchical methods are used to decrease the size of the computations and increase robustness [15, 16, 27].

It should be noted that the computational complexity of meshless methods in effect covers what mesh-based approaches split between the tessellation itself and the interpolation or approximation methods. This leads to problems with large data set processing, i.e. numerical stability, memory requirements, etc.

If global RBF functions are used, the RBF matrix is full, and in the case of \( 10^{6} \) points the RBF matrix is of size approx. \( 10^{6} \times 10^{6} \)! On the other hand, if CSRBFs are used, the relevant matrix is sparse, and the computational and memory requirements are decreased significantly using special data structures [8, 10, 20, 27].

In the case of physical phenomena visualization, data obtained by simulation, computation, or experiment are usually oversampled in some areas and of limited numerical precision. It therefore seems possible to apply approximation methods, decreasing the computational complexity significantly by adding virtual points in the areas of interest and using an analogy of the least squares method modified for the RBF case [3, 12, 17, 25].

Due to the CSRBF representation, the data space can be subdivided, and the interpolation, resp. approximation, can be split into parts computed more or less independently [20]. This process can also be parallelized, and if an appropriate computational architecture is used, e.g. a GPU, it leads to faster computation as well. The approach was experimentally verified for scalar and vector data used in the visualization of physical phenomena.

3 Points of Importance

Recently developed algorithms are based on different specific properties of "global" RBFs or "local" compactly supported RBFs (CSRBFs) and the expected application areas, e.g. interpolation, approximation, solution of partial differential equations, etc., expecting a "reasonable" density of points. However, there are still some important problems to be analyzed and hopefully solved, especially:

  • What is an acceptable compromise between the precision of approximation and compression ratio, i.e. reduction of points, if applicable?

  • What is the optimal constant shape parameter, if it exists, and how can it be estimated efficiently [26]?

  • What are the optimal shape parameters \( \alpha \) for each individual \( \varphi \left( {r,\alpha } \right) \) [24, 26]?

  • What is the robustness and stability of the RBF approach for large data sets and a large range span of data with regard to the shape parameters [16, 17]?

In this contribution, we will analyze a specific problem related to the first question.

Let us consider given points of a curve (samples of a signal), described by an explicit function \( y = f\left( x \right) \). According to the Nyquist-Shannon sampling theorem, the sampling frequency should be at least double the highest frequency contained in the original signal. The idea is to determine how "points of importance", i.e. points of inflection and extrema, can be used for a smooth and precise curve approximation.

Let us consider the sampled curves in Fig. 2, i.e. a signal without noise (the blue points are values at the borders, the red points are extrema, and the black points are inflection and added points). It can be seen that a reconstruction based on radial basis functions (RBF) has to pass through:

Fig. 2. Testing functions and resulting approximation based on the points of importance (red points are extrema, black points are additional points of importance) (Color figure online)

  • points at the interval borders

  • points at extrema, i.e. maxima and minima

  • some other important points, like points of inflection, and perhaps some additional points of the given data to improve the signal reconstruction.

However, there are several factors to be considered as well, namely:

  • extensibility from 2D to 3D for explicit functions of two variables, i.e. \( z = f\left( {x,y} \right) \), and hopefully to higher dimensions

  • robustness of the computation, as the data are given in discrete form.

For finding extrema, the first derivative \( f^{\prime}\left( x \right) \) is replaced by a standard discrete scheme. At the left, resp. right, margin the forward, resp. backward, difference is used. Inside the interval, the central difference scheme is recommended, as it also "filters out" high frequencies. A simple scheme for the second derivative estimation is shown, too. It can be seen that this is easily extensible to the 3D case as well.

$$ \begin{array}{*{20}c} {f^{\prime } \left( x \right) \approx \frac{{f\left( {x_{i + 1} } \right) - f\left( {x_{i} } \right)}}{{x_{i + 1} - x_{i} }}} & {f^{\prime } \left( x \right) \approx \frac{{f\left( {x_{i} } \right) - f\left( {x_{i - 1} } \right)}}{{x_{i} - x_{i - 1} }}} \\ {f^{\prime } \left( x \right) \approx \frac{{f\left( {x_{i + 1} } \right) - f\left( {x_{i - 1} } \right)}}{{x_{i + 1} - x_{i - 1} }}\;} & {f^{\prime \prime } \left( x \right) \approx \frac{{f\left( {x_{i + 1} } \right) - 2f\left( {x_{i} } \right) + f\left( {x_{i - 1} } \right)}}{{\left( {x_{i + 1} - x_{i} } \right)\left( {x_{i} - x_{i - 1} } \right)}}} \\ \end{array} $$
(8)
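The scheme (8) can be sketched as follows, assuming the samples are ordered by \( x \); the copying of the second-derivative estimate to the margins is an assumption, as the text does not specify the margin handling for \( f^{\prime\prime} \):

```python
import numpy as np

def derivative_estimates(x, f):
    """First/second derivative estimates per Eq. (8): forward difference at
    the left margin, backward at the right margin, central inside."""
    d1 = np.empty(len(x))
    d1[0] = (f[1] - f[0]) / (x[1] - x[0])            # forward
    d1[-1] = (f[-1] - f[-2]) / (x[-1] - x[-2])       # backward
    d1[1:-1] = (f[2:] - f[:-2]) / (x[2:] - x[:-2])   # central
    d2 = np.empty(len(x))
    d2[1:-1] = (f[2:] - 2.0 * f[1:-1] + f[:-2]) \
        / ((x[2:] - x[1:-1]) * (x[1:-1] - x[:-2]))
    d2[0], d2[-1] = d2[1], d2[-2]                    # copied at the margins
    return d1, d2
```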

Finding extrema is now a simple task. However, due to the discrete data, an extremum is detected by

$$ {\text{sign}}\left( {f\left( {x_{i + 1} } \right) - f\left( {x_{i} } \right)} \right) \ne {\text{sign}}\left( {f\left( {x_{i} } \right) - f\left( {x_{i - 1} } \right)} \right) $$
(9)

as we only need to detect the change of sign. This increases the robustness of the computation as well. The points of inflection rely on the second derivative, i.e. \( f^{\prime\prime}\left( x \right) = 0 \); a similar sign-change condition can be derived from (8).
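The sign-change tests can be sketched as follows (a minimal illustration; the samples are assumed to be ordered by \( x \)):

```python
import numpy as np

def importance_indices(f):
    """Indices of discrete extrema via the sign test (9), and of discrete
    inflection points via the same test applied to second differences."""
    d1 = np.sign(np.diff(f))                   # sign(f(x_{i+1}) - f(x_i))
    extrema = np.where(d1[1:] != d1[:-1])[0] + 1
    d2 = np.sign(np.diff(f, 2))                # sign of second differences
    inflections = np.where(d2[1:] != d2[:-1])[0] + 1
    return extrema, inflections
```

For a sampled sine wave over one period, this finds the maximum, the minimum, and the interior inflection point.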

Now all the important points, i.e. the points at the interval borders, the maxima, minima, and points of inflection, have been detected. However, it is necessary to include a few more points near the interval borders (at least one on each side) to respect the local behavior of the curve and increase the precision of the approximation. It is recommended to include at least the one or two points closest to each border, to respect the curve behavior at the beginning and end of the interval. Also, if additional points are inserted, ideally between the extrema and inflection points, the approximation precision increases. Now, the standard RBF interpolation scheme can be applied.

$$ \left[ { \begin{array}{*{20}c} \varvec{B} & \varvec{P} \\ {\varvec{P}^{T} } & \user2{0} \\ \end{array} } \right]\left[ {\begin{array}{*{20}c}\varvec{\lambda}\\ \varvec{a} \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} \varvec{f} \\ \user2{0} \\ \end{array} } \right]\;\;\;\;\;\varvec{Ax} = \varvec{b}\;\;\;\;\varvec{a}^{T} \varvec{x}_{\varvec{i}} = a_{x} x_{i} + a_{0} $$
(10)

where: \( \varvec{B} \) represents the RBF submatrix, \( \varvec{\lambda} \) the weights of the RBFs, \( \varvec{P} \) the points for the polynomial, \( \varvec{a} \) the coefficients of the polynomial, and \( \varvec{f} \) the given function values.

It should be noted that in the case of scattered data, the neighbors of each point have to be found before the derivative estimation is made. In the 2D case, an ordering is possible; in the 3D case, the computation is made on the neighbors found. If regular sampling in each dimension (along the axes) is given, the computation simplifies significantly.

It is necessary to note that the curve reconstruction operates at the Nyquist-Shannon boundary, and probably the limits of compression were obtained, with a very low relative error of less than \( 0.1 \% \). However, many more points are available, and if higher precision is needed, an approximation based on the Least Square Error (LSE) computational scheme with Lagrange multipliers may be used [11]. RBF methods usually lead to an ill-conditioned system of linear equations [26]. In the case of approximation, this can be partially improved by the geometric algebra in projective space approach [18, 19].
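The whole pipeline of this section can be sketched for a 1D signal as follows. This is only an illustrative implementation under assumptions the text does not fix: a Gaussian basis \( \varphi \left( r \right) = e^{{ - \left( {\epsilon r} \right)^{2} }} \) with an arbitrarily chosen shape parameter `eps`, and the single nearest neighbor of each border as the extra border points:

```python
import numpy as np

def approx_by_importance(x, f, eps=0.5):
    """Keep the border points, their nearest neighbours, the extrema and the
    inflection points, then solve system (10) for the reduced point set."""
    # points of importance via sign changes (Eq. (9) and its f'' analogue)
    d1 = np.sign(np.diff(f))
    extrema = np.where(d1[1:] != d1[:-1])[0] + 1
    d2 = np.sign(np.diff(f, 2))
    inflect = np.where(d2[1:] != d2[:-1])[0] + 1
    keep = sorted({0, 1, len(x) - 2, len(x) - 1}
                  | set(extrema.tolist()) | set(inflect.tolist()))
    xs, hs = x[keep], f[keep]
    # RBF system (10): Gaussian basis plus linear polynomial a_x*x + a_0
    M = len(xs)
    B = np.exp(-(eps * np.abs(xs[:, None] - xs[None, :])) ** 2)
    P = np.stack([xs, np.ones(M)], axis=1)
    A = np.block([[B, P], [P.T, np.zeros((2, 2))]])
    w = np.linalg.solve(A, np.concatenate([hs, np.zeros(2)]))
    lam, a = w[:M], w[M:]

    def fit(t):
        t = np.asarray(t, dtype=float)
        phi = np.exp(-(eps * np.abs(t[..., None] - xs)) ** 2)
        return phi @ lam + a[0] * t + a[1]

    return fit, keep
```

For a sine wave sampled at 200 points, only a handful of points of importance are kept, and the fitted function still reproduces the given values at those points exactly, illustrating the compression idea.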

4 Experimental Results

The presented approach was tested on several testing functions used for the evaluation of errors, stability, and robustness of the computation, see Table 2.

The experiments have also proven that for large data and data with a large span, the polynomial \( P_{k} \left( x \right) \) should be \( P_{k} \left( x \right) = a_{0} \), i.e. \( k = 0 \), see [16, 17].

Selected results of the approximation of some functions are presented in Fig. 3. It can be seen that the proposed approximation, actually based on the RBF interpolation scheme using points of importance, offers a good precision of approximation with a good compression ratio. The functions were sampled at approximately 200 points, while only 10–20 points are actually used by the proposed approximation method.

Fig. 3. Examples of approximation for selected functions.

5 Conclusion

This contribution briefly describes a method for efficient RBF approximation of large scattered data based on finding points of importance. It leads to a simple RBF-based approximation of the data with a relatively low error and high compression. The precision of the approximation can be increased significantly by including some additional points. The approach is easily extensible to the 3D case, especially if the data are ordered. However, if the data are scattered, the neighboring points must be evaluated to find the points of importance.

The experiments proved a relatively high precision of the approximation based on RBF interpolation using the found points of importance, leading to high data compression as well.

In the future, a deeper analysis of the approximation behavior at the interval borders is expected, as it is a critical issue for the 3D case, i.e. \( z = f\left( {x,y} \right) \), as the first experiments have already shown. Also, the discrete points of curves of inflection are to be taken into account, i.e. discrete points of implicit curves \( F\left( {x,y} \right) = 0 \).