# Finding Points of Importance for Radial Basis Function Approximation of Large Scattered Data


## Abstract

Interpolation and approximation methods are used in engineering and in many other disciplines. If the data domain is formed by scattered data, approximation methods may become very complicated and time-consuming. Usually, the given data is tessellated by some method, not necessarily the Delaunay triangulation, to produce triangular or tetrahedral meshes. After that, approximation methods can be used to produce the surface. However, it is difficult to ensure the continuity and smoothness of the final interpolant across all adjacent triangles. In this contribution, a meshless approach based on radial basis functions (RBFs) is proposed. It is applicable to explicit functions of two variables and is suitable for all types of scattered data in general. The key point for the RBF approximation is finding the important points that give a good approximation of the scattered data with high precision. Since compactly supported RBFs (CSRBFs) have a limited region of influence, large data sets can be processed efficiently and quickly. The main advantage of the RBF approach is that it leads to a solution of a system of linear equations (SLE) \( \varvec{Ax} = \varvec{b} \). Thus, any efficient method for solving systems of linear equations can be used. In this study, we propose a new method of determining the points of importance in the scattered data that produces a very good reconstructed surface with higher accuracy while maintaining the smoothness of the surface.

## Keywords

Meshless methods, Radial Basis Functions, Approximation

## 1 Introduction

Interpolation and approximation techniques are used in the solution of many engineering problems. However, the interpolation of unorganized scattered data is still a severe problem. In the one-dimensional case, i.e. curves represented as \( y = f\left( x \right) \), it is possible to order points according to the \( x \)-coordinate. However, in higher dimensions this is not possible. Therefore, the standard approaches are based on the tessellation of the domain in \( x,y \) or \( x,y,z \) spaces using, e.g., Delaunay triangulation [7]. This approach is applicable to static data and to \( t \)-varying data if the data in the time domain are “framed”, i.e. given for specific time samples. It also leads to an increase of dimensionality, i.e. from triangulation in \( E^{2} \) to triangulation in \( E^{3} \) or from triangulation in \( E^{3} \) to triangulation in \( E^{4} \), etc. This results in a significant increase of the triangulation complexity and of the complexity of implementing a triangulation algorithm. This is a significant factor influencing computation in the case of large data sets and large-range data sets, i.e. when \( x,y,z \) values span several orders of magnitude.

In contrast, RBF interpolation and approximation offer several advantages:

- RBF interpolation is applicable generally to \( d \)-dimensional problems and does not require tessellation of the definition domain
- RBF interpolation and approximation is especially convenient for scattered data interpolation, including interpolation of scattered data in time as well
- RBF interpolation is smooth by definition
- RBF interpolation can be applied to scalar fields and vector fields as well, which can be used for scalar and vector field visualization
- if Compactly Supported RBFs (CSRBFs) are used, sparse matrix data structures can be used, which decreases memory requirements significantly.

On the other hand, the RBF approach has some weak points:

- there is a real problem for large data sets with robustness and reliability of the RBF application due to the ill-conditioning of the matrix \( \varvec{A} \) of the system of linear equations, which is to be solved
- numerical stability and representation problems arise over a large span of \( x,y,z \) values, i.e. if the values span several orders of magnitude
- problems with memory management, as the memory requirements are of \( O\left( {N^{2} } \right) \) complexity, where \( N \) is the number of points in which values are given
- the computational complexity of a solution of the linear system, which is \( O\left( {N^{3} } \right) \), resp. \( O\left( {kN^{2} } \right) \), where \( k \) is the number of iterations if an iterative method is used, but \( k \) is relatively high, in general
- problems with unexpected behavior at geometrical borders.

Many contributions address particular issues of RBF interpolation and approximation. Numerical tests are mostly made using some standard testing functions and a restricted domain span, mostly the interval \( \langle 0,1 \rangle \) or similar. However, in many physically based applications the span of the domain is much larger, usually over several orders of magnitude, and large data sets need to be processed.

As the meshless techniques are easily scalable to higher dimensions and can handle spatial scattered data and spatial-temporal data as well, they can be used in many engineering and economical computations, etc. Polygonal representations (tessellated domains) are used in computer graphics and visualization as a surface representation and for surface rendering. In time-varying objects, a surface is represented as a triangular mesh with constant connectivity.

On the other hand, all polygonal based techniques, in the case of scattered data, require tessellations, e.g. Delaunay triangulation with \( O\left( N^{\left\lfloor d/2 + 1 \right\rfloor } \right) \) computational complexity for \( N \) points in \( d \)-dimensional space or another tessellation method. However, the complexity of tessellation algorithms implementation grows significantly with dimensionality and severe problems with robustness might be expected, as well.

In the case of data visualization smooth interpolation or approximation on unstructured meshes is required, e.g. on triangular or tetrahedral meshes, when physical phenomena are associated with points, in general. This is quite a difficult task especially if the smoothness of interpolation is needed. However, it is a natural requirement in physically-based problems.

## 2 Meshless Interpolation

Meshless (meshfree) methods are based on the idea of Radial Basis Function (RBF) interpolation [1, 2, 22, 23], which is not separable. RBF-based techniques are easily scalable to \( d \)-dimensional space, do not require tessellation of the geometric domain, and offer smooth interpolation naturally. In general, meshless techniques lead to a solution of a system of linear equations [4, 5] with a full or sparse matrix.

Two formulations are distinguished:

- “implicit” – \( F\left( \varvec{x} \right) = 0 \), i.e. \( F\left( {x,y,z} \right) = 0 \), used in the case of a surface representation in \( E^{3} \), e.g. surface reconstruction resulting in an implicit function representation. This problem originates from the implicit function modeling approach [15],
- “explicit” – \( F\left( \varvec{x} \right) = h \), used in interpolation or approximation resulting in a functional representation, e.g. a height map in \( E^{2} \), i.e. \( h = F\left( {x,y} \right) \).

In the RBF formulation, a polynomial \( P_{k} \left( \varvec{x} \right) \) of degree \( k \) is added to the weighted sum of radial basis functions [6]. It means that for the given data set \( \left\{ \langle {\varvec{x}_{i} ,h_{i} }\rangle \right\}_{1}^{M} \), where \( h_{i} \) are associated values to be interpolated and \( \varvec{x}_{i} \) are domain coordinates, we obtain a system of linear equations with one interpolation condition per data point.

A polynomial of the 1st degree, i.e. the linear polynomial \( P_{1} \left( \varvec{x} \right) = \varvec{a}^{T} \varvec{x} \), is used in many applications. Therefore, the interpolation function is the weighted sum of \( M \) radial basis functions plus this linear polynomial, where \( M \) is the number of points in the dataset and \( d \) is the dimensionality of the data. For \( d = 2 \), the vectors \( \varvec{x}_{i} \) and \( \varvec{a} \) have the form \( \varvec{x}_{i} = \left[ {x_{i} , y_{i} ,1} \right]^{T} \) and \( \varvec{a} = \left[ {a_{x} , a_{y} ,a_{0} } \right]^{T} \).

For the two-dimensional case and \( M \) given points, a system of \( \left( {M + 3} \right) \) linear equations has to be solved. If “global” functions, e.g. \( \varphi \left( r \right) = r^{2} lg\, r \), are used, then the matrix \( \varvec{B} \) of this system is “full”; if “local” functions (CSRBFs) are used, the matrix \( \varvec{B} \) can be sparse.
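For reference, a common way to write the interpolant and the resulting system is sketched below. It follows the standard RBF formulation and is consistent with the definitions of \( \varvec{x}_{i} \), \( \varvec{a} \), \( M \) and \( \varvec{B} \) used in the text; the weights \( \lambda_{j} \), the blocks \( \varvec{A} \) and \( \varvec{P} \), and the exact block layout of \( \varvec{B} \) are assumptions introduced here for illustration.

\[
f\left( \varvec{x} \right) = \sum_{j = 1}^{M} \lambda_{j} \, \varphi \left( \left\| \varvec{x} - \varvec{x}_{j} \right\| \right) + \varvec{a}^{T} \varvec{x}, \qquad f\left( \varvec{x}_{i} \right) = h_{i}, \quad i = 1, \ldots ,M, \qquad \sum_{j = 1}^{M} \lambda_{j} \varvec{x}_{j} = \varvec{0}
\]

\[
\varvec{B} \begin{bmatrix} \varvec{\lambda} \\ \varvec{a} \end{bmatrix} = \begin{bmatrix} \varvec{A} & \varvec{P} \\ \varvec{P}^{T} & \varvec{0} \end{bmatrix} \begin{bmatrix} \varvec{\lambda} \\ \varvec{a} \end{bmatrix} = \begin{bmatrix} \varvec{h} \\ \varvec{0} \end{bmatrix}, \qquad A_{ij} = \varphi \left( \left\| \varvec{x}_{i} - \varvec{x}_{j} \right\| \right), \qquad \varvec{P} = \left[ \varvec{x}_{1} , \ldots ,\varvec{x}_{M} \right]^{T}
\]

For \( d = 2 \) the block \( \varvec{A} \) is \( M \times M \) and \( \varvec{P} \) is \( M \times 3 \), which gives the \( \left( {M + 3} \right) \) equations mentioned above.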

The RBF interpolation was originally introduced by Hardy as the multiquadric method in 1971 [5], which was later called the Radial Basis Function (RBF) method. Since then, many different RBF interpolation schemes with specific properties have been developed, e.g. Duchon [4] uses \( \varphi \left( r \right) = r^{2} lg\, r \), which is called the Thin-Plate Spline (TPS), and the function \( \varphi \left( r \right) = e^{{ - \left( { \varepsilon r} \right)^{2} }} \) was proposed in [23]. However, the shape parameter \( \varepsilon \) might lead to an ill-conditioned system of linear equations [26].
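The construction of the \( \left( {M + 3} \right) \times \left( {M + 3} \right) \) system for \( d = 2 \) can be sketched in Python with NumPy. This is a minimal sketch, not the authors' implementation: it assumes the TPS with the natural logarithm, the usual block layout with a linear polynomial, and a dense solver; the function names `tps`, `fit_rbf` and `evaluate_rbf` are hypothetical.

```python
import numpy as np

def tps(r):
    """Thin-Plate Spline phi(r) = r^2 log r (natural log assumed), phi(0) = 0."""
    out = np.zeros_like(r)
    mask = r > 0
    out[mask] = r[mask] ** 2 * np.log(r[mask])
    return out

def fit_rbf(points, values):
    """Solve the (M+3) x (M+3) system for the weights and the linear polynomial."""
    M = len(points)
    # Pairwise distances between all data points (full matrix, O(M^2) memory).
    r = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    P = np.hstack([points, np.ones((M, 1))])    # rows [x_i, y_i, 1]
    B = np.zeros((M + 3, M + 3))
    B[:M, :M] = tps(r)
    B[:M, M:] = P
    B[M:, :M] = P.T
    rhs = np.concatenate([values, np.zeros(3)])
    solution = np.linalg.solve(B, rhs)
    return solution[:M], solution[M:]           # weights, [a_x, a_y, a_0]

def evaluate_rbf(points, weights, a, query):
    """Evaluate the interpolant at the query points."""
    r = np.linalg.norm(query[:, None, :] - points[None, :, :], axis=2)
    Q = np.hstack([query, np.ones((len(query), 1))])
    return tps(r) @ weights + Q @ a
```

With data sampled from a plane, the fitted weights vanish and the polynomial part reproduces the plane, which is a convenient sanity check of the assembled system.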

Table 1. Typical examples of “local” functions – CSRBF (“\( + \)” means the value is zero outside \( \langle 0,1 \rangle \))

| ID | Function | ID | Function |
|---|---|---|---|
| 1 | \( \left( {1 - r} \right)_{ + } \) | 6 | \( \left( {1 - r} \right)_{ + }^{6} \left( {35r^{2} + 18r + 3} \right) \) |
| 2 | \( \left( {1 - r} \right)_{ + }^{3} \left( {3r + 1} \right) \) | 7 | \( \left( {1 - r} \right)_{ + }^{8} \left( {32r^{3} + 25r^{2} + 8r + 3} \right) \) |
| 3 | \( \left( {1 - r} \right)_{ + }^{5} \left( {8r^{2} + 5r + 1} \right) \) | 8 | \( \left( {1 - r} \right)_{ + }^{3} \) |
| 4 | \( \left( {1 - r} \right)_{ + }^{2} \) | 9 | \( \left( {1 - r} \right)_{ + }^{3} \left( {5r + 1} \right) \) |
| 5 | \( \left( {1 - r} \right)_{ + }^{4} \left( {4r + 1} \right) \) | 10 | \( \left( {1 - r} \right)_{ + }^{7} \left( {16r^{2} + 7r + 1} \right) \) |

Table 2. Examples of testing functions

| ID | Function | ID | Function |
|---|---|---|---|
| 1 | \( y = \sin \left( {15x^{2} + 5x} \right) \) | 2 | \( y = \cos \left( {20x} \right)/2 + 5x \) |
| 3 | \( y = 50\left( {0.4\, {\text{sin}}\left( {15x^{2} } \right) + 5x} \right) \) | 4 | \( y = \sin \left( {8\pi x} \right) \) |
| 5 | \( y = \sin \left( {6\pi x^{2} } \right) \) | 6 | \( y = \sin \left( {25x + 0.1} \right)/\left( {25x + 0.1} \right) \) |
| 7 | \( y = 2\sin \left( {2\pi x} \right) + \sin \left( {4\pi x} \right) \) | 8 | \( y = 2\sin \left( {2\pi x} \right) + \sin \left( {4\pi x} \right) + \sin \left( {8\pi x} \right) \) |
| 9 | \( y = 2\sin \left( {\pi \left( {2x - 1} \right)} \right) + \sin \left( {3\pi \left( {2x - 1/2} \right)} \right) \) | 10 | \( y = 2\sin \left( {\pi \left( {1 - 2x} \right)} \right) + \sin \left( {3\pi \left( {2x - 1/2} \right)} \right) \) |
| 11 | \( y = 2\sin \left( {\pi \left( {2x - 1} \right)} \right) + \sin \left( {3\pi \left( {2x - 1/2} \right)} \right) - x \) | 12 | \( y = 2\sin \left( {2\pi x - \frac{\pi }{2}} \right) + \sin \left( {3\pi \left( {2x - 1/2} \right)} \right) \) |
| 13 | \( y = {\text{atan}}\left( {10x - 5} \right)^{3} + {\text{atan}}\left( {10x - 8} \right)^{3} /2 \) | 14 | \( y = \left( {4.88x - 1.88} \right) * \sin \left( {4.88x - 1.88} \right)^{2} + 1 \) |
| 15 | \( y = \exp \left( {10x - 6} \right) * \sin \left( {5x - 2} \right)^{3} + \left( {3x - 1} \right)^{3} \) | 16 | \( y = \tanh \left( {9x + 1/2} \right)/9 \) |

The compactly supported RBFs are defined for the “normalized” interval \( r \in \langle 0,1 \rangle \), but for practical use a scaling is applied, i.e. the value \( r \) is multiplied by a shape parameter \( \alpha \), where \( \alpha > 0 \).
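The scaling can be illustrated with a short Python sketch. It uses function no. 5 from the CSRBF table above; the function names `csrbf_5`, `scaled_csrbf` and `support_radius` are hypothetical and chosen only for this example.

```python
def csrbf_5(r):
    """CSRBF no. 5: (1 - r)^4_+ (4r + 1); the value is zero for r >= 1."""
    if r >= 1.0:
        return 0.0
    return (1.0 - r) ** 4 * (4.0 * r + 1.0)

def scaled_csrbf(distance, alpha):
    """Evaluate the CSRBF at the scaled radius alpha * distance, alpha > 0."""
    if alpha <= 0:
        raise ValueError("shape parameter alpha must be positive")
    return csrbf_5(alpha * distance)

def support_radius(alpha):
    """Distance beyond which the scaled function vanishes."""
    return 1.0 / alpha
```

A larger \( \alpha \) shrinks the region of influence: with \( \alpha = 2 \) the function vanishes for all distances of \( 0.5 \) or more, so fewer point pairs contribute nonzero matrix entries.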

Meshless techniques are primarily based on the approaches mentioned above. They are used in engineering problem solutions, nowadays, e.g. partial differential equations, surface modeling, surface reconstruction of scanned objects [13, 14], reconstruction of corrupted images [21], etc. More generally, meshless object representation is based on specific interpolation or approximation techniques [1, 6, 23].

The resulting matrix \( \varvec{A} \) tends to be large and ill-conditioned. Therefore, some specific numerical methods have to be taken to increase the robustness of a solution, like preconditioning methods or parallel computing on GPU [9, 10], etc. In addition, subdivision or hierarchical methods are used to decrease the sizes of computations and increase robustness [15, 16, 27].

It should be noted that the *computational complexity* of meshless methods actually covers both the complexity of the tessellation itself and that of the interpolation or approximation method. This results in problems with large data set processing, i.e. numerical stability, memory requirements, etc.

If global RBFs are considered, the RBF matrix is full, and in the case of \( 10^{6} \) points the RBF matrix has a size of approx. \( 10^{6} \times 10^{6} \). On the other hand, if CSRBFs are used, the relevant matrix is sparse, and computational and memory requirements can be decreased significantly by using special data structures [8, 10, 20, 27].
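The difference in memory requirements can be made concrete with a rough estimate. The sketch below assumes 8-byte floating-point values, 4-byte indices, a compressed-row style sparse layout, and a hypothetical average of 50 neighbors inside the support of each CSRBF; none of these figures come from the experiments.

```python
def dense_matrix_bytes(n_points, bytes_per_value=8):
    """Memory of a full N x N RBF matrix."""
    return n_points * n_points * bytes_per_value

def sparse_matrix_bytes(n_points, avg_neighbors,
                        bytes_per_value=8, bytes_per_index=4):
    """Rough memory of a row-compressed sparse matrix:
    one value and one column index per nonzero, plus N + 1 row offsets."""
    nonzeros = n_points * avg_neighbors
    return (nonzeros * (bytes_per_value + bytes_per_index)
            + (n_points + 1) * bytes_per_index)
```

Under these assumptions, `dense_matrix_bytes(10**6)` gives 8 TB for the full matrix, while `sparse_matrix_bytes(10**6, 50)` gives roughly 0.6 GB.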

In the case of physical phenomena visualization, data obtained by simulation, computation or experiments are usually oversampled in some areas and are numerically more or less precise. It seems possible to apply approximation methods to decrease computational complexity significantly by adding virtual points in the places of interest and using an analogy of the least squares method modified for the RBF case [3, 12, 17, 25].

Due to the CSRBF representation, the data space can be subdivided, so that the interpolation or approximation can be split into parts that are computed more or less independently [20]. This process can also be parallelized, and if an appropriate computational architecture is used, e.g. a GPU, it leads to faster computation as well. The approach was experimentally verified for scalar and vector data used in the visualization of physical phenomena.

## 3 Points of Importance

Several questions need to be addressed in RBF approximation:

- What is an acceptable compromise between the precision of approximation and the compression ratio, i.e. reduction of points, if applicable?
- What is the optimal constant shape parameter, if it exists, and how can it be estimated efficiently [26]?
- What are the optimal shape parameters \( \alpha \) for every single \( \varphi \left( {r,\alpha } \right) \) [24, 26]?
- What is the robustness and stability of the RBF for large data and a large span of data with regard to shape parameters [16, 17]?

Let us consider given points of a curve (samples of a signal) described by an explicit function \( y = f\left( x \right) \). According to the Nyquist-Shannon theorem, the sampling frequency should be at least twice the highest frequency of the original signal. The question is how “points of importance”, i.e. points of inflection and extrema, can be used for smooth and precise curve approximation.

The points of importance considered are:

- points at the interval borders
- points at extremes, i.e. maxima and minima
- some other important points, like points of inflection etc., and perhaps some additional points of the given data to improve signal reconstruction.

Two further aspects have to be considered: extensibility from 2D to 3D for explicit functions of two variables, i.e. \( z = f\left( {x,y} \right) \), and hopefully to higher dimensions, and robustness of computation, as only discrete data are given.

It should be noted that in the case of scattered data, neighbors of each point have to be found before the derivative is estimated. In the 2D case, ordering is possible; in the 3D case, the computation has to be made on the neighbors found. If regular sampling in each dimension (along each axis) is given, the computation simplifies significantly.
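For ordered, regularly sampled values of \( y = f\left( x \right) \), the selection of border points, extrema and inflection points can be sketched with finite differences. This is only an illustration under those assumptions; the function names are hypothetical and the selection procedure actually used in the experiments may differ, e.g. in how noise or plateaus are handled.

```python
def sign_changes(diffs):
    """Return index pairs (j, k) of consecutive nonzero entries of
    opposite sign; zero entries in between are skipped."""
    pairs = []
    last = None  # index of the last nonzero entry seen
    for k, d in enumerate(diffs):
        if d == 0:
            continue
        if last is not None and (d > 0) != (diffs[last] > 0):
            pairs.append((last, k))
        last = k
    return pairs

def points_of_importance(y):
    """Sorted sample indices of borders, local extrema and inflections."""
    n = len(y)
    if n < 3:
        return list(range(n))
    d1 = [y[i + 1] - y[i] for i in range(n - 1)]    # slope on [i, i+1]
    d2 = [d1[i + 1] - d1[i] for i in range(n - 2)]  # curvature at sample i+1
    selected = {0, n - 1}                           # interval borders
    for j, k in sign_changes(d1):                   # slope changes sign
        selected.add((j + 1 + k) // 2)
    for j, k in sign_changes(d2):                   # curvature changes sign
        selected.add((j + k + 2) // 2)
    return sorted(selected)
```

For samples of a cubic such as `[-27, -8, -1, 0, 1, 8, 27]`, the sketch keeps only the two border points and the inflection in the middle, i.e. indices `[0, 3, 6]`.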

It is necessary to note that the curve reconstruction is at the Nyquist-Shannon theorem boundary, and the limits of compression were probably reached with a very low relative error of less than \( 0.1 \% \). However, many more points are available, and if a higher precision is needed, an approximation based on the Least Square Error (LSE) computational scheme with Lagrange multipliers might be used [11]. The RBF methods usually lead to an ill-conditioned system of linear equations [26]. In the case of approximation, the conditioning can be partially improved by an approach based on geometric algebra in projective space [18, 19].
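How additional data points can enter an approximation is sketched below for the 1D case. Note the substitutions: the sketch uses an ordinary least-squares solve from NumPy rather than the Lagrange-multiplier scheme of [11], CSRBF no. 5 from the table above, and a constant polynomial term \( a_{0} \); the function names and the choice of centers and \( \alpha \) are hypothetical.

```python
import numpy as np

def csrbf_5(r):
    """CSRBF no. 5: (1 - r)^4_+ (4r + 1), zero for r >= 1."""
    return np.where(r < 1.0, (1.0 - r) ** 4 * (4.0 * r + 1.0), 0.0)

def fit_lse(x, y, centers, alpha):
    """Least-squares fit of RBF weights (one per center) plus a constant a_0,
    using all samples (x, y), not only the centers."""
    A = csrbf_5(alpha * np.abs(x[:, None] - centers[None, :]))
    A = np.hstack([A, np.ones((len(x), 1))])        # column for a_0
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs

def approximate(xq, centers, alpha, coeffs):
    """Evaluate the fitted approximation at the query positions xq."""
    A = csrbf_5(alpha * np.abs(xq[:, None] - centers[None, :]))
    return A @ coeffs[:-1] + coeffs[-1]
```

In practice the centers would be the points of importance, possibly with a few extra points, while `x` and `y` hold all available samples, so the system is overdetermined and the extra samples reduce the approximation error.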

## 4 Experimental Results

The presented approach was tested on several testing functions used for the evaluation of errors, stability and robustness of computation, see Table 2.

The experiments have also shown that for large data and data with a large span, the polynomial \( P_{k} \left( x \right) \) should be \( P_{k} \left( x \right) = a_{0} \), i.e. \( k = 0 \), see [16, 17].

## 5 Conclusion

This contribution briefly describes a method for efficient RBF approximation of large scattered data based on finding points of importance. This leads to a simple RBF-based approximation of data with relatively low error and high compression. The precision of approximation can be increased significantly by including some additional points. The approach is easily extensible to the 3D case, especially if data are ordered. However, if data are scattered, the neighbor points must be evaluated to find points of importance.

Experiments showed a relatively high precision of approximation based on RBF interpolation using the found points of importance, leading to high data compression as well.

In the future, a deep analysis of the approximation behavior at the interval borders is expected, as it is a critical issue for the 3D case, i.e. \( z = f\left( {x,y} \right) \), as the first experiments have shown. Also, the discrete points of curves of inflection are to be taken into account, i.e. discrete points of implicit curves \( F\left( {x,y} \right) = 0 \).

## Notes

### Acknowledgments

The authors would like to thank their colleagues and students at the University of West Bohemia and Universiti Teknologi PETRONAS for their discussions and suggestions; especially to Michal Smolik, Zuzana Majdisova and Jakub Vasta from the University of West Bohemia. Thanks belong also to anonymous reviewers for their valuable comments and hints provided.

This research was supported by the Czech Science Foundation (GACR) project GA 17-05534S and partially by SGS 2019-016.

## References

- 1. Biancolini, M.E.: Fast Radial Basis Functions for Engineering Applications. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-75011-8
- 2. Buhmann, M.D.: Radial Basis Functions: Theory and Implementations. Cambridge University Press, Cambridge (2008)
- 3. Cervenka, M., Smolik, M., Skala, V.: A new strategy for scattered data approximation using radial basis functions respecting points of inflection. In: Misra, S., et al. (eds.) ICCSA 2019. LNCS, vol. 11619, pp. 322–336. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-24289-3_24
- 4. Duchon, J.: Splines minimizing rotation-invariant semi-norms in Sobolev space. In: Schempp, W., Zeller, K. (eds.) Constructive Theory of Functions of Several Variables. LNM, vol. 571. Springer, Heidelberg (1977). https://doi.org/10.1007/BFb0086566
- 5. Hardy, R.L.: Multiquadric equation of topography and other irregular surfaces. J. Geophys. Res. **76**(8), 1905–1915 (1971)
- 6. Fasshauer, G.E.: Meshfree Approximation Methods with MATLAB. World Scientific Publishing, Singapore (2007)
- 7. Karim, S.A.A., Saaban, A., Skala, V.: Range-restricted interpolation using rational bi-cubic spline functions with 12 parameters. IEEE Access **7**, 104992–105006 (2019). ISSN 2169-3536. https://doi.org/10.1109/access.2019.2931454
- 8. Majdisova, Z., Skala, V.: A new radial basis function approximation with reproduction. In: CGVCVIP 2016, Portugal, pp. 215–222 (2016). ISBN 978-989-8533-52-4
- 9. Majdisova, Z., Skala, V.: Radial basis function approximations: comparison and applications. Appl. Math. Model. **51**, 728–743 (2017). https://doi.org/10.1016/j.apm.2017.07.033
- 10. Majdisova, Z., Skala, V.: Big geo data surface approximation using radial basis functions: a comparative study. Comput. Geosci. **109**, 51–58 (2017). https://doi.org/10.1016/j.cageo.2017.08.007
- 11. Majdisova, Z., Skala, V., Smolik, M.: Determination of stationary points and their bindings in dataset using RBF methods. In: Silhavy, R., Silhavy, P., Prokopova, Z. (eds.) CoMeSySo 2018. AISC, vol. 859, pp. 213–224. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-00211-4_20
- 12. Majdisova, Z., Skala, V., Smolik, M.: Determination of reference points and variable shape parameter for RBF approximation. Integr. Comput.-Aided Eng. **27**(1), 1–15 (2020). https://doi.org/10.3233/ICA-190610. ISSN 1069-2509
- 13. Pan, R., Skala, V.: A two level approach to implicit modeling with compactly supported radial basis functions. Eng. Comput. **27**(3), 299–307 (2011). https://doi.org/10.1007/s00366-010-0199-1. ISSN 0177-0667
- 14. Pan, R., Skala, V.: Surface reconstruction with higher-order smoothness. Vis. Comput. **28**(2), 155–162 (2012). https://doi.org/10.1007/s00371-011-0604-9. ISSN 0178-2789
- 15. Ohtake, Y., Belyaev, A., Seidel, H.-P.: A multi-scale approach to 3D scattered data interpolation with compactly supported basis functions. In: Shape Modeling, pp. 153–161. IEEE, Washington (2003). https://doi.org/10.1109/smi.2003.1199611
- 16. Skala, V.: RBF interpolation with CSRBF of large data sets, ICCS 2017. Procedia Comput. Sci. **108**, 2433–2437 (2017). https://doi.org/10.1016/j.procs.2017.05.081
- 17. Skala, V.: RBF interpolation and approximation of large span data sets. In: MCSI 2017 – Corfu, pp. 212–218. IEEE (2018). https://doi.org/10.1109/mcsi.2017.44
- 18. Skala, V., Karim, S.A.A., Kadir, E.A.: Scientific computing and computer graphics with GPU: application of projective geometry and principle of duality. Int. J. Math. Comput. Sci. **15**(3), 769–777 (2020). ISSN 1814-0432
- 19. Skala, V.: High dimensional and large span data least square error: numerical stability and conditionality. Int. J. Appl. Phys. Math. **7**(3), 148–156 (2017). https://doi.org/10.17706/ijapm.2017.7.3.148-156. ISSN 2010-362X
- 20. Smolik, M., Skala, V.: Large scattered data interpolation with radial basis functions and space subdivision. Integr. Comput.-Aided Eng. **25**(1), 49–62 (2018). https://doi.org/10.3233/ica-170556
- 21. Uhlir, K., Skala, V.: Reconstruction of damaged images using radial basis functions. In: EUSIPCO 2005 Conference Proceedings, Turkey (2005). ISBN 975-00188-0-X
- 22. Wendland, H.: Scattered Data Approximation. Cambridge University Press (2010). http://doi.org/10.1017/CBO9780511617539
- 23. Wright, G.B.: Radial basis function interpolation: numerical and analytical developments. Ph.D. thesis, University of Colorado, Boulder (2003)
- 24. Skala, V., Karim, S.A.A., Zabran, M.: Radial basis function approximation optimal shape parameters estimation: preliminary experimental results. In: ICCS 2020 Conference (2020)
- 25. Vasta, J., Skala, V., Smolik, M., Cervenka, M.: Modified radial basis functions approximation respecting data local features. In: Informatics 2019, IEEE Proceedings, Poprad, Slovakia, pp. 445–449 (2019). ISBN 978-1-7281-3178-8
- 26. Cervenka, M., Skala, V.: Conditionality analysis of the radial basis function matrix. In: International Conference on Computational Science and Applications ICCSA (2020)
- 27. Smolik, M., Skala, V.: Efficient speed-up of radial basis functions approximation and interpolation formula evaluation. In: International Conference on Computational Science and Applications ICCSA (2020)