Theory and Realization of Reference Systems

  • Athanasios Dermanis
Living reference work entry
Part of the Springer Reference Naturwissenschaften book series (SRN)

Abstract

After a short introduction on the basics of reference system theory and its application to the description of earth rotation, the problem of establishing a reference system for the discrete stations of a geodetic network is studied, from both a theoretical and a practical, implementation-oriented point of view.

First the case of rigid networks is examined, which covers also the case of deformable networks with data collected within a time span small enough for the network shape to remain practically unaltered. The problem of how to analyze observations, which are invariant under particular changes of the reference system, is examined within the framework of least squares estimation theory, with a rank deficiency in the design matrix. The complete theory is presented, including all necessary proofs, covering not only the usual statistical results for the rank-deficient linear Gauss-Markov model, but also those of the rich geodetic theory, which exploits the fact that the physical cause of the rank deficiency is known to be the lack of definition of the reference system. The additional geodetic results are based on the fact that one can easily construct a matrix whose columns are a basis of the null space of the design matrix. Insights are presented into the geometric characteristics of the problem and its relation to the theory of generalized inverses. Passing to deformable networks, a deterministic mathematical model is presented, based on the concept of geodesic lines that are the shortest curves between the linear shape manifolds associated with the network shape at each instant. Reference system optimality for a discrete network is related to the relevant ideas of Tisserand, developed for the continuum of the earth masses.

The practical problem of choosing a reference system for a coordinate time series is examined, for the case where a linear-in-time model is adopted for the temporal variation of the coordinates. The choice of reference system is related to the choice of minimal constraints for obtaining one out of the infinitely many least squares solutions, corresponding to descriptions in different reference systems of the same sequence of network shapes. The a-posteriori change of the reference system is examined, where one moves from one least squares solution to another that satisfies particular minimal constraints. Kinematic minimal constraints are also introduced, leading to coordinates that demonstrate the minimum coordinate variation and are thus connected to the ideas of Tisserand for reference system optimality. It is also shown how to convert a reference system of a geodetic network to one for the whole continuous earth, or at least the lithosphere, utilizing additional geophysical information.

The last item is the combination of data from four space techniques (VLBI, SLR, GPS, DORIS) in order to establish a global reference system, realized through a number of parameters that constitute the International Terrestrial Reference Frame. After a theoretical exposition of the basics of data combination, the various methods of space data combination are presented, for both coordinate and Earth Orientation Parameter time series, while alternatives are presented for the choice of the origin (geocenter) and of the network scale from the scales of VLBI and SLR. Finally, existing and new methodologies are presented for building post-linear models, describing the temporal variation of station coordinates.

Keywords

Reference systems · Rank-deficient linear model · Minimal constraints · Inner constraints · Kinematic constraints · Tisserand reference system · Coordinate time series · Earth orientation parameters · Combination of geodetic space techniques · International Terrestrial Reference Frame (ITRF)

1 Introduction

Applied mathematics aims at the description of physical processes by means of mathematical equations. In order to achieve this goal, physical entities, such as points, time instants, and scalar, vector or tensor quantities, must be represented by real numbers. In a more general relativistic setup, where space-time is considered to be a curved manifold, points are converted to numbers with the use of a coordinate system, namely a one-to-one correspondence between events (point plus epoch) and a tetrad of numbers, three for the point and one for the time instant. As geodesists well know from the use of geodetic longitude and latitude to describe points on the earth surface, such a one-to-one correspondence is generally impossible to achieve: the poles have a unique latitude but they correspond to any value of the geodetic longitude. In modern differential geometry, this problem is bypassed by means of an atlas, which is a collection of coordinate systems called charts, each chart covering only an open subset of the manifold.

When a curved manifold is embedded in a flat space, e.g., a two-dimensional surface embedded in the Euclidean three-dimensional space, vectors can be viewed in the usual way, as tangent to the manifold, living within the surrounding flat space and not within the manifold. This cannot be achieved, though, when a curved manifold is considered by itself, without any embedding into a flat space. To overcome this problem, modern differential geometry replaces vectors with directional derivatives, i.e., derivatives along all curves passing through the point considered, which are tangent to each other and have the same rate of displacement with respect to their parametric representation.

Fortunately, these mathematical complications can be avoided in geodesy, thanks to the implementation of the Newtonian-Euclidean model for space-time, which separates time from space and allows the possibility of parallel translation within the flat three-dimensional space. Although general relativity is not irrelevant to modern geodetic observations, it is customary to perform “relativistic corrections” on them, which allow their further analysis within a Newtonian-Euclidean model.

A reference system for time consists of a particular time instant called the time origin, a time interval serving as the unit of time, and a direction, which is necessarily the one from the past towards the future. This allows the representation of any time instant by a number, namely the ratio of the time interval between the instant and the time origin to the time unit, with a positive sign when the instant occurs later than the time origin and a negative one otherwise.

A reference system within the Euclidean three-dimensional space serves two purposes at the same time: it represents points by three numbers, their Cartesian coordinates, and local vectors by three numbers, their components with respect to a local set of three base vectors. A reference system consists of a particular point O, called the origin, three directed non-coplanar straight lines passing through the origin, called the axes, and a line segment serving as the unit of length. Alternatively, we may replace the axes and the unit of length by three non-coplanar vectors at the origin, visualized as directed line segments (arrows!), having length equal to the length unit and directions those of the axes pointing toward their positive sense. Thus a reference system \((O, \vec {e}_1 , \vec {e}_2 , \vec {e}_3 )\) consists of the origin O and the vector basis \(\vec {e}_1 \), \(\vec {e}_2 \), \(\vec {e}_3 \). Following the usual practice, we will assume hereon that the three base vectors are perpendicular to each other, a choice which greatly simplifies, and thus facilitates, relevant computations. It is also assumed that the base vectors form a right-handed triad, which means that looking from \(\vec {e}_3 \), \(\vec {e}_1 \) appears to be on the right with respect to \(\vec {e}_2 \). With this choice, we will speak of a Cartesian reference system and Cartesian coordinates. For any other point P the directed line segment \(\vec {x} = \vec {OP}\) serves as the position vector of P and can be expressed as a unique linear combination of the base vectors \(\vec {x} = x^1\vec {e}_1 + x^2\vec {e}_2 + x^3\vec {e}_3 \). The three components \(x^1\), \(x^2\), \(x^3\) of the position vector serve as the Cartesian coordinates of the point P. A local vector \(\vec {v}\) at any point P can be represented by its components \(v^1\), \(v^2\), \(v^3\) with respect to a local basis \(\vec {e}_1 (P)\), \(\vec {e}_2 (P)\), \(\vec {e}_3 (P)\), which results from the parallel transport of the reference system basis \(\vec {e}_1 (O)\), \(\vec {e}_2 (O)\), \(\vec {e}_3 (O)\) from the origin O to the point P. The vector components are the coefficients in the linear combination \(\vec {v}(P) = v^1\vec {e}_1 (P) + v^2\vec {e}_2 (P) + v^3\vec {e}_3 (P)\). We will use here matrix notation by setting
$$\displaystyle \begin{aligned} \vec{\mathbf{e}} = \left[\vec{e}_1 \ \vec{e}_2 \ \vec{e}_3\right],\quad \mathbf{x} = \left[ \begin{array}{c} {x^1} \\ {x^2} \\ {x^3} \end{array} \right],\qquad \mathbf{v} = \left[ \begin{array}{c} {v^1} \\ {v^2} \\ {v^3} \end{array} \right], \end{aligned} $$
(1)
which allows us to write
$$\displaystyle \begin{aligned} \vec{x} &= x^1\vec{e}_1 + x^2\vec{e}_2 + x^3\vec{e}_3 = \left[\vec{e}_1 \ \vec{e}_2 \ \vec{e}_3 \right]\left[ \begin{array}{c} {x^1} \\ {x^2} \\ {x^3} \end{array} \right] = {\vec{\mathbf{e}}\mathbf{x}}, \end{aligned} $$
(2)
$$\displaystyle \begin{aligned} \vec{v}& = v^1\vec{e}_1 + v^2\vec{e}_2 + v^3\vec{e}_3 = \left[\vec{e}_1 \ \vec{e}_2 \ \vec{e}_3 \right]\left[ \begin{array}{c} {v^1} \\ {v^2} \\ {v^3} \end{array} \right] = {\vec{\mathbf{e}}\mathbf{v}}, \end{aligned} $$
(3)
omitting the dependence of the basis on the relevant point O or P, since it is clear from the context. We will also make extensive use of the antisymmetric matrix [a×] with axial vector (column matrix) a, defined as
$$\displaystyle \begin{aligned}{}[\mathbf{a}\times ] = \left[ \begin{array}{ccc} 0 & { - a_3 } & {a_2 } \\ {a_3 } & 0 & { - a_1 } \\ { - a_2 } & {a_1 } & 0 \end{array} \right],\qquad \mathbf{a} = \left[ \begin{array}{c} {a_1 } \\ {a_2 } \\ {a_3 } \end{array} \right], \end{aligned} $$
(4)
which allows us to express the exterior vector product \(\vec {c} = \vec {a}\times \vec {b}\) through the matrix expression c = [a×]b. We will also make use of the relation
$$\displaystyle \begin{aligned}{}[(\mathbf{Qa})\times ] = \mathbf{Q}[\mathbf{a}\times ]{\mathbf{Q}}^T, \end{aligned} $$
(5)
which is valid for any proper orthogonal matrix Q (QTQ = QQT = I, \(\det \mathbf {Q} = + 1\)). We will also use dots for expressing derivatives with respect to time, e.g., \(\dot {p} \equiv \frac {dp}{dt}\).
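
The matrix conventions above translate directly into code. The following minimal sketch (in Python with NumPy; the helper name skew is ours) checks numerically that [a×]b reproduces the exterior product of Eq. (4), and that the transformation rule (5) holds for a proper orthogonal Q:
```python
import numpy as np

def skew(a):
    """Antisymmetric matrix [a x] of Eq. (4), so that skew(a) @ b equals a x b."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

rng = np.random.default_rng(0)
a, b = rng.standard_normal(3), rng.standard_normal(3)

# c = [a x] b reproduces the exterior (cross) product
assert np.allclose(skew(a) @ b, np.cross(a, b))

# Eq. (5): [(Q a) x] = Q [a x] Q^T for a proper orthogonal Q
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
Q = Q * np.sign(np.linalg.det(Q))   # enforce det Q = +1
assert np.allclose(skew(Q @ a), Q @ skew(a) @ Q.T)
print("Eqs. (4) and (5) verified numerically")
```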

2 Reference Systems in Motion: Generalized Euler Kinematic Equations – The Rotation Vector Concept

For the analysis of modern geodetic observations carried out by space techniques such as GPS (GNSS), VLBI, SLR and DORIS, it is necessary to implement at least two reference systems: an inertial celestial reference system for the description of satellite orbits and radio source directions, and a terrestrial reference system, which represents the deforming earth in the best possible way. The motion of the earth can thus be separated into three parts:
  (a) translational motion of its origin, ideally chosen as the geocenter (earth mass center), with respect to the inertial space, i.e., the orbit of the earth around the sun as studied by celestial mechanics;

  (b) rotation of the earth (more precisely, of the chosen terrestrial reference system) with respect to the celestial reference system, as studied by geodesists and astronomers;

  (c) deformation of the earth, i.e., the motion of its masses with respect to the chosen terrestrial reference system.
For the analysis of observations within a relatively small time interval, say one day, the curvature of the orbit of the earth can be practically ignored, so that the earth may be assumed to move along a straight line. Thus a reference system having the same origin as the terrestrial reference system, but axes parallel to those of the celestial one, can be considered as a quasi-inertial reference system. This choice allows us to leave aside the translational motion and study only the rotation of the terrestrial reference system with respect to the quasi-inertial celestial reference system.
In a more general context we study the rotation of a rotating reference system \((O,\vec {\mathbf {e}}(t))\) with respect to a non-rotating one \((O,\vec {\mathbf {e}}^0)\). The bases of the two systems will be related by
$$\displaystyle \begin{aligned} \vec{\mathbf{e}}(t) = \vec{\mathbf{e}}^0{\mathbf{R}}^T(t), \end{aligned} $$
(6)
where R(t) is a time-dependent orthogonal matrix, called the rotation matrix, while \(\vec {\mathbf {e}}^0 = \vec {\mathbf {e}}(t)\mathbf {R}(t)\) is the corresponding inverse relation. Omitting the dependence on time for the sake of simplicity, it follows from \(\vec {x} = \vec {\mathbf {e}}^0{\mathbf {x}}_0 = \vec {\mathbf {e}}\mathbf {x} = \vec {\mathbf {e}}^0{\mathbf {R}}^T\mathbf {x}\) that x0 = RTx, and thus the coordinates in the two systems will be related by
$$\displaystyle \begin{aligned} \mathbf{x} = \mathbf{Rx}_0 . \end{aligned} $$
(7)
Note that we have chosen the use of the transpose RT in the transformation of the bases (6) so that the rotation matrix R appears in the more applicable coordinate transformation (7).
The time rate of the rotating basis follows by differentiating (6) to obtain \(\frac {d\vec {\mathbf {e}}}{dt} =\vec {\mathbf {e}}^0\frac {d{\mathbf {R}}^T}{dt} = \vec {\mathbf {e}}\mathbf {R}\frac {d{\mathbf {R}}^T}{dt}\). It is easy to verify by differentiating the relation RRT = I that the matrix \(\mathbf {R}\frac {d{\mathbf {R}}^T}{dt}\) is antisymmetric and denoting by ω its axial vector we set
$$\displaystyle \begin{aligned} \mathbf{R}\frac{d{\mathbf{R}}^T}{dt} = [\boldsymbol{\upomega }\times ], \end{aligned} $$
(8)
and thus
$$\displaystyle \begin{aligned} \frac{d\vec{\mathbf{e}}}{dt} = \vec{\mathbf{e}}[\boldsymbol{\upomega }\times ].\end{aligned} $$
(9)
The relations (8) are the generalized kinematic Euler equations. Their specific form depends on the particular representation of the rotation matrix R. For the representation through Euler angles R = R1(φ)R3(θ)R1(ψ), where Rj(θj) represents a rotation around the axis j by an angle θj, one obtains the usual kinematic Euler equations appearing in texts. We will give here the ones corresponding to the usual geodetic choice of Cardan angles R = R3(θ3)R2(θ2)R1(θ1):
$$\displaystyle \begin{aligned} \omega^1 &= \sin \theta_3 \frac{d\theta_2 }{dt} + \cos \theta_3 \cos \theta_2 \frac{d\theta_1 }{dt}, \\ \omega^2 &= \cos \theta_3 \frac{d\theta_2 }{dt} - \sin \theta_3 \cos \theta_2 \frac{d\theta_1 }{dt}, \\ \omega^3 &= \frac{d\theta_3 }{dt} + \sin \theta_2 \frac{d\theta_1 }{dt}. {} \end{aligned} $$
(10)
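The following sketch (Python/NumPy; it assumes the geodetic sign convention for the elementary rotation matrices Rj used throughout this chapter) verifies Eq. (10) numerically, by comparing the rotation vector components extracted from the matrix R dRᵀ∕dt of Eq. (8), evaluated by finite differences, with the analytic Cardan-angle expressions:
```python
import numpy as np

def R1(t): c, s = np.cos(t), np.sin(t); return np.array([[1, 0, 0], [0, c, s], [0, -s, c]])
def R2(t): c, s = np.cos(t), np.sin(t); return np.array([[c, 0, -s], [0, 1, 0], [s, 0, c]])
def R3(t): c, s = np.cos(t), np.sin(t); return np.array([[c, s, 0], [-s, c, 0], [0, 0, 1]])

# Cardan angles as arbitrary smooth functions of time, with their derivatives
th  = lambda t: np.array([0.3 * np.sin(t), 0.2 * t, 0.1 * np.cos(t)])
thd = lambda t: np.array([0.3 * np.cos(t), 0.2, -0.1 * np.sin(t)])

def R(t):
    t1, t2, t3 = th(t)
    return R3(t3) @ R2(t2) @ R1(t1)

t, dt = 0.7, 1e-6
# Eq. (8): [omega x] = R dR^T/dt, with the derivative taken by central differences
W = R(t) @ (R(t + dt).T - R(t - dt).T) / (2 * dt)
omega_num = np.array([W[2, 1], W[0, 2], W[1, 0]])      # axial vector of [omega x]

# analytic components from the Cardan-angle kinematic Euler equations (10)
t1, t2, t3 = th(t)
d1, d2, d3 = thd(t)
omega_ana = np.array([np.sin(t3) * d2 + np.cos(t3) * np.cos(t2) * d1,
                      np.cos(t3) * d2 - np.sin(t3) * np.cos(t2) * d1,
                      d3 + np.sin(t2) * d1])
print(np.max(np.abs(omega_num - omega_ana)))           # agrees to discretization error
```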
The vector \(\vec {\omega } = \vec {\mathbf {e}}\boldsymbol {\upomega } = \vec {\mathbf {e}}^0\boldsymbol {\upomega }_0 \), with components ω in the rotating system and ω0 = RTω in the non-rotating one, is none other than the rotation vector of the rotating reference system, which is defined as the vector \(\vec {\omega } = \omega \,\vec {n}\), where \(\omega = \vert \vec {\omega }\vert \) is the instantaneous angular velocity and \(\vec {n}\) the unit vector in the direction of the instantaneous axis of rotation. The latter can be defined as follows: the transition from the position of the axes \(\vec {\mathbf {e}}(t)\) at epoch t to their position \(\vec {\mathbf {e}}(t + \Delta t)\) at a later epoch t + Δt can be achieved by a single rotation around an axis \(\vec {n}(t,t + \Delta t)\) by an angle Δθ, where ω(t, t + Δt) = Δθ∕ Δt is the mean angular velocity of this rotation. \(\vec {n}\) and ω are the respective limits of \(\vec {n}(t,t + \Delta t)\) and ω(t, t + Δt) as Δt → 0. An alternative form of the kinematic Euler equations follows by replacing ω = Rω0, which gives \(\mathbf {R}\frac {d{\mathbf {R}}^T}{dt} = [(\mathbf {R}\boldsymbol {\upomega }_0 )\times ] = \mathbf {R}[\boldsymbol {\upomega }_0 \times ]{\mathbf {R}}^T\) and hence
$$\displaystyle \begin{aligned}{}[\boldsymbol{\upomega}_0 \times ] = \frac{d{\mathbf{R}}^T}{dt}\mathbf{R}. \end{aligned} $$
(11)
Any vector \(\vec {z} = \vec {\mathbf {e}}^0{\mathbf {z}}_0 = \vec {\mathbf {e}}\mathbf {z}\) has time derivative \(\frac {d\vec {z}}{dt} = \vec {\mathbf {e}}^0\frac {d{\mathbf {z}}_0 }{dt} = \frac {d\vec {\mathbf {e}}}{dt}\mathbf {z} + \vec {\mathbf {e}}\frac {d\mathbf {z}}{dt} =\)\(\vec {\mathbf {e}}\left ( {[\boldsymbol {\upomega }\times ]\mathbf {z} + \frac {d\mathbf {z}}{dt}} \right )\). Applying this to the velocity \(\vec {v} = \frac {d\vec {x}}{dt} = \vec {\mathbf {e}}\mathbf {v}\) and the acceleration \(\vec {a} = \frac {d\vec {v}}{dt} = \vec {\mathbf {e}}{\mathbf {a}} = \vec {\mathbf {e}}^0{\mathbf {a}}_0 \), it follows that their components in the rotating system are \(\mathbf {v} = [\boldsymbol {\upomega }\times ]\mathbf {x} + \frac {d\mathbf {x}}{dt}\) and \(\mathbf {a} = [\boldsymbol {\upomega }\times ]\mathbf {v} + \frac {d\mathbf {v}}{dt}\), which combined give
$$\displaystyle \begin{aligned} \mathbf{a} = \frac{d^2\mathbf{x}}{dt^2} + [\boldsymbol{\upomega }\times ]^2\mathbf{x} + 2[\boldsymbol{\upomega }\times ]\frac{d\mathbf{x}}{dt} + [\frac{d\boldsymbol{\upomega }}{dt}\times ]\mathbf{x}.\end{aligned} $$
(12)
Newton’s second law of dynamics \(\vec {a} \equiv \frac {d^2\vec {x}}{dt^2} = \vec {f}\), where \(\vec {f} = {\vec {\mathbf e}}{\mathbf {f}} = \vec {\mathbf {e}}^0{\mathbf {f}}_0 \) are the applied forces per unit mass, is represented by \({\mathbf {a}}_0 = \frac {d^2{\mathbf {x}}_{0}}{dt^2} = {\mathbf {f}}_0 \) in the non-rotating system and by a = f in the rotating one, which in view of (12) becomes
$$\displaystyle \begin{aligned} \frac{d^2\mathbf{x}}{dt^2} = \mathbf{f} - [\boldsymbol{\upomega }\times ]^2\mathbf{x} - 2[\boldsymbol{\upomega }\times ]\frac{d\mathbf{x}}{dt} - [\frac{d\boldsymbol{\upomega }}{dt}\times ]\mathbf{x}.\end{aligned} $$
(13)
This means that the apparent acceleration \(\frac {d^2\mathbf {x}}{dt^2}\), as seen within the rotating system, depends, in addition to the actual applied forces, on three pseudo-forces: the centrifugal force p = −[ω×]2x = (ωTω)x − (ωTx)ω, the Coriolis force \(\mathbf {q} = - 2[\boldsymbol {\upomega }\times ]\frac {d\mathbf {x}}{dt}\), and the gyroscopic force \(\mathbf {g} = - [\frac {d\boldsymbol {\upomega }}{dt}\times ]\mathbf {x}\). The Coriolis force is exerted only on bodies which are moving with respect to the rotating system, i.e., when \(\frac {d\mathbf {x}}{dt} \ne \mathbf {0}\), while the gyroscopic force appears only when the rotation vector changes, either in direction with respect to the rotating system, or in magnitude. For the rotating terrestrial reference system, the Coriolis force is exerted only on bodies moving with respect to the earth, while the gyroscopic force is very small, as it depends on polar motion and on variations in the angular velocity of the earth (variations in the length of the day), which are both physical phenomena of small magnitude.
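
As a rough numerical illustration of the relative size of these pseudo-forces, the following sketch (Python/NumPy; the adopted value ω ≈ 7.292115 × 10⁻⁵ rad/s and the test point are our illustrative choices) evaluates the centrifugal and Coriolis accelerations for a body moving eastwards on the equator:
```python
import numpy as np

def skew(a):
    return np.array([[0.0, -a[2], a[1]], [a[2], 0.0, -a[0]], [-a[1], a[0], 0.0]])

omega = np.array([0.0, 0.0, 7.292115e-5])   # earth rotation vector [rad/s], terrestrial system
R_e   = 6378137.0                           # equatorial radius [m]
x     = np.array([R_e, 0.0, 0.0])           # point on the equator
v     = np.array([0.0, 30.0, 0.0])          # body moving east at 30 m/s

p = -skew(omega) @ skew(omega) @ x          # centrifugal acceleration, points outward
q = -2.0 * skew(omega) @ v                  # Coriolis acceleration
print(np.linalg.norm(p))                    # ~0.0339 m/s^2, the familiar equatorial value
print(np.linalg.norm(q))                    # ~0.0044 m/s^2
```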

3 Reference Systems for the Description of Earth Rotation

In order to describe the rotation of the terrestrial reference system with respect to the quasi-inertial celestial reference system, we must take advantage of the fact that it is dominated by the diurnal rotation. Indeed the major part of earth rotation takes place around an axis with slowly varying position with respect to the celestial system (precession and nutation), as well as with respect to the earth (polar motion), and with a slightly varying rotational velocity of about 2π per day. In view of the dominant role of diurnal rotation, it is wise to separate large from small parts of “earth rotation” by introducing two intermediate reference systems with their third axis aligned with the rotation axis, one that rotates with the earth and one that does not. Denoting the terrestrial reference system by \(\vec {\mathbf {e}}^T\) and the celestial one by \(\vec {\mathbf {e}}^C\), the two new reference systems are the intermediate terrestrial one \(\vec {\mathbf {e}}^{IT}\) and the intermediate celestial one \(\vec {\mathbf {e}}^{IC}\), where \(\vec {e}_3^{IT} = \vec {e}_3^{IC} = \vec {n}\), with \(\vec {n} = \frac {1}{\omega }\vec {\omega }\) being the unit vector in the direction of the rotation vector. If xT, xC, xIT, xIC are the coordinates of any point in the respective reference systems \(\vec {\mathbf {e}}^T\), \(\vec {\mathbf {e}}^C\), \(\vec {\mathbf {e}}^{IT}\), \(\vec {\mathbf {e}}^{IC}\), the overall rotation transformation xT = RxC can be analyzed into three parts
$$\displaystyle \begin{aligned} {\mathbf{x}}_T = \mathbf{Rx}_C = \mathbf{WR}_3 (\theta )\mathbf{Qx}_C. \end{aligned} $$
(14)
The matrix Q of the transformation xIC = QxC from the celestial to the intermediate celestial reference system represents the phenomenon of precession and nutation, i.e., the variation of the rotation axis direction \(\vec {n} = \vec {e}_3^{IC} \) with respect to the celestial reference system. The matrix WT of the transformation xIT = WTxT from the terrestrial to the intermediate terrestrial reference system represents the phenomenon of polar motion, i.e., the variation of the rotation axis direction \(\vec {n} = \vec {e}_3^{IT} \) with respect to the terrestrial reference system. Finally R3(θ) is the matrix of the diurnal rotation xIT = R3(θ)xIC around the rotation axis \(\vec {n} = \omega ^{ - 1}\vec {\omega }\), where θ is the earth rotation angle. The explicit representations of the precession-nutation and polar motion matrices are
$$\displaystyle \begin{aligned} \mathbf{Q} = {\mathbf{R}}_3 ( - s){\mathbf{R}}_3 ( - E){\mathbf{R}}_2 (d){\mathbf{R}}_3 (E),\quad {\mathbf{W}}^T = {\mathbf{R}}_3 ( - {s}^{\prime}){\mathbf{R}}_3 ( - F){\mathbf{R}}_2 (g){\mathbf{R}}_3 (F). \end{aligned} $$
(15)
Within the precession-nutation matrix, the rotation R3(E) brings the second axis into a position perpendicular to the plane of \(\vec {e}_3^C \) and \(\vec {n}\), while the following rotation R2(d) aligns the third axis with the \(\vec {n} = \vec {e}_3^{IC} \) direction. The following rotation R3(−E) merely brings the first and second axes closer to their original positions. The remaining rotation R3(−s) serves to bring the \(\vec {e}_1^{IC} \) axis into its desired position, through an appropriate choice of the tuning angle s, so that it does not follow the earth in its diurnal rotation. Completely analogous is the situation within the transpose of the polar motion matrix W, where the rotation R3(−s′) serves to bring the \(\vec {e}_1^{IT} \) axis into its desired position, through an appropriate choice of the tuning angle s′, so that it does follow the earth in its diurnal rotation.
In order to proceed with the choice of the tuning angles s and s′, so that the definition of the two intermediate reference systems is completed, we must give a precise mathematical meaning to the expressions “follows the earth in its diurnal rotation” and “does not follow the earth in its diurnal rotation”. This is achieved with the introduction of the concept of the relative rotation vector between any two reference systems, in analogy to the rotation vector of a rotating reference system with respect to a non-rotating one. Let \(\vec {\mathbf {e}}^A\), \(\vec {\mathbf {e}}^B\) be two reference systems related by \(\vec {\mathbf {e}}^B = \vec {\mathbf {e}}^A{\mathbf {R}}_{A \to B}^T \). Then in analogy to Eq. (8) we may define the relative rotation vector ωA→B through the relation
$$\displaystyle \begin{aligned}{}[\boldsymbol{\upomega}_{A \to B} \times ] = {\mathbf{R}}_{A \to B} \frac{d{\mathbf{R}}_{A \to B}^T }{dt}. \end{aligned} $$
(16)
Applying this definition to the precession-nutation and the polar motion matrix we may define the relative rotation vectors ωQ = ωC→IC and ωW = ωT→IT as
$$\displaystyle \begin{aligned}{}[\boldsymbol{\upomega}_Q \times ] = \mathbf{Q}\frac{d{\mathbf{Q}}^T}{dt},\ [\boldsymbol{\upomega}_W \times ] = {\mathbf{W}}^T\frac{d\mathbf{W}}{dt}. \end{aligned} $$
(17)
The intermediate celestial reference system does not follow the earth in its diurnal rotation when the relative rotation vector ωQ = ωC→IC of precession-nutation has no component along the rotation axis \(\vec {n} = \vec {e}_3^{IC} \), i.e., when \({\omega }_Q^3 = 0\). The intermediate terrestrial reference system follows the earth in its diurnal rotation when the relative rotation vector ωW = ωT→IT of polar motion has no component along the rotation axis \(\vec {n} = \vec {e}_3^{IT} \), i.e., when \({\omega }_W^3 = 0\). Thus we have given a precise mathematical context to the requirements for the choice of the tuning angles s and s′, which the astronomers have called the Non Rotating Origin principle (NRO) (see [18] for the original ideas and [29] for the rigorous mathematical elaboration). Note that in astronomy directions are depicted as points on a unit sphere and the “origin” in this case is the point representing the \(\vec {e}_1^{IC} \) direction. It must therefore hold that \(\omega _Q^3 = [\boldsymbol {\upomega }_Q \times ]_{21} = \left (\mathbf {Q}\frac {d{\mathbf {Q}}^T}{dt}\right ) _{21} = 0\) and \(\omega _W^3 = [\boldsymbol {\upomega }_W \times ]_{21} = \left ({\mathbf {W}}^T\frac {d\mathbf {W}}{dt}\right )_{21} = 0\), in which case, performing the necessary differentiations, we arrive at the final NRO conditions
$$\displaystyle \begin{aligned} \frac{ds}{dt} = (\cos d - 1)\frac{dE}{dt},\quad \frac{d{s}^{\prime}}{dt} = (\cos g - 1)\frac{dF}{dt}, \end{aligned} $$
(18)
of which the first is indeed an NRO condition, while the second should rather be called a Rotating Origin (RO) condition. The obvious solutions to the above equations are
$$\displaystyle \begin{aligned} s(t) = s_0 + \int\nolimits_{t_0 }^t {[\cos d(t) - 1]} \frac{dE}{dt}(t)dt,\qquad {s}^{\prime}(t) = {s}^{\prime}_0 + \int\nolimits_{t_0 }^t {[\cos g(t) - 1]} \frac{dF}{dt}(t)dt. \end{aligned} $$
(19)
The official IERS representation of earth rotation does not implement the above angles E, d for precession-nutation and F, g for polar motion. For precession-nutation, the first two, X and Y, of the celestial components of the rotation direction \(\vec {n} = \vec {\mathbf {e}}^C{\mathbf {n}}_C \) are used instead, which are related to E, d according to
$$\displaystyle \begin{aligned} {\mathbf{n}}_C = \left[ \begin{array}{c} X \\ Y \\ Z \end{array} \right] = \left[ \begin{array}{c} {\cos \,E\,\sin \,d} \\ {\sin \,E\,\sin \,d} \\ {\cos \,d} \end{array} \right]. \end{aligned} $$
(20)
In terms of X and Y the precession-nutation matrix assumes the representation
$$\displaystyle \begin{aligned} \mathbf{Q} = {\mathbf{R}}_3 ( - s)\left[ \begin{array}{ccc} {1 - aX^2} & { - aXY} & { - X} \\ { - aXY} & {1 - aY^2} & { - Y} \\ X & Y & {1 - a(X^2 + Y^2)} \end{array} \right], \end{aligned} $$
(21)
where
$$\displaystyle \begin{aligned} a=\frac{1}{1 + \cos d} = \frac{1}{1 + \sqrt{1 - X^2 - Y^2} } \approx \frac{1}{2} + \frac{1}{8}(X^2 + Y^2). \end{aligned} $$
(22)
The IERS does not follow the obvious symmetric alternative for the representation of the polar motion matrix W in terms of \(\xi = \cos F\sin g\), \(\eta = \sin F\sin g\), i.e., the first two of the components nT = [ξ η ζ]T of the rotation direction \(\vec {n} = \vec {\mathbf {e}}^T{\mathbf {n}}_T \) in the terrestrial reference system. Instead it sticks to the traditional representation \(\mathbf {W} = {\mathbf {R}}_1 (-y_{{ }_{P}} ){\mathbf {R}}_2 ( - x_{{ }_{P}} )\), with an additional rotation by the tuning angle s′, so that
$$\displaystyle \begin{aligned} \mathbf{W} = {\mathbf{R}}_1 ( - y_{{}_{P}} ){\mathbf{R}}_2 ( - x_{{}_{P}} ){\mathbf{R}}_3 ({s}^{\prime}). \end{aligned} $$
(23)
The NRO conditions in terms of the new parameters X, Y, xP, yP take the form
$$\displaystyle \begin{aligned} \dot{s} = a(\dot{X}Y - X\dot{Y}),\qquad \qquad {\dot{s}}^{\prime} = \dot{y}_P \sin x_P \approx \dot{y}_p x_p . \end{aligned} $$
(24)
The tuning angle s in terms of the above representation is given by
$$\displaystyle \begin{aligned} s(t) & = s_0 + \int\nolimits_{t_0 }^t {a(t)[X(t)\dot{Y}} (t) - Y(t)\dot{X}(t)]dt\approx\\ &\approx s_0 - \frac{1}{2}[X(t)Y(t) - X(t_0 )Y(t_0 )] + \int\nolimits_{t_0 }^t \dot{X} (t)Y(t)dt, \end{aligned} $$
(25)
where \(\dot {X} = \frac {dX}{dt}\), \(\dot {Y} = \frac {dY}{dt}\), and the value s0 = −94 μas is chosen in order to secure continuity with the previous, now abandoned, version of the earth rotation representation.
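
In practice the integral in Eq. (25) is evaluated numerically from sampled series of X(t), Y(t). The following sketch (Python/NumPy; the input signals are synthetic toy data of realistic magnitude, not IAU series values) illustrates the computation:
```python
import numpy as np

# hypothetical sampled CIP celestial coordinates X(t), Y(t); in practice these
# come from the IAU precession-nutation series plus VLBI corrections
t = np.linspace(0.0, 1.0, 1001)                  # time, in some convenient unit
X = 1e-6 * np.sin(2 * np.pi * t)                 # toy signals, radians
Y = 1e-6 * np.cos(2 * np.pi * t)
Xd, Yd = np.gradient(X, t), np.gradient(Y, t)    # numerical time derivatives

a = 0.5 + 0.125 * (X**2 + Y**2)                  # Eq. (22), first-order approximation
s0 = -94e-6 * np.pi / 180 / 3600                 # -94 micro-arcseconds in radians
integrand = a * (X * Yd - Y * Xd)                # integrand of Eq. (25)

# cumulative trapezoidal integration gives the tuning angle s(t)
s = s0 + np.concatenate(([0.0],
        np.cumsum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))))
```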
Note that since xP, yP are quite small quantities, it holds in first-order approximation that ξ ≈ −xP, η ≈ yP. The tuning angle s′ in terms of xP, yP is given by
$$\displaystyle \begin{aligned} {s}^{\prime}(t) = \frac{1}{2}\int\nolimits_{t_0 }^t [x_p (t)\dot{y}_p (t) - \dot{x}_p (t)y_p (t)]dt \approx - 47\upmu\mathrm{as}\ t. \end{aligned} $$
(26)
With the above representations the total earth rotation is described by
$$\displaystyle \begin{aligned} {\mathbf{x}}_T &= \mathbf{Rx}_C =\\ & = {\mathbf{R}}_1 ( - y_P ){\mathbf{R}}_2 ( - x_P ){\mathbf{R}}_3 ({s}^{\prime} + \theta - s)\left[ \begin{array}{ccc} {1 - aX^2} & { - aXY} & { - X} \\ { - aXY} & {1 - aY^2} & { - Y} \\ X & Y & {1 - a(X^2 + Y^2)} \end{array} \right]{\mathbf{x}}_C . \end{aligned} $$
(27)
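For illustration, the complete transformation (27) can be coded directly. The following sketch (Python/NumPy; the function name and the numerical EOP values are ours, chosen only for their realistic orders of magnitude) builds the total rotation matrix and checks its orthogonality:
```python
import numpy as np

def R1(t): c, s = np.cos(t), np.sin(t); return np.array([[1, 0, 0], [0, c, s], [0, -s, c]])
def R2(t): c, s = np.cos(t), np.sin(t); return np.array([[c, 0, -s], [0, 1, 0], [s, 0, c]])
def R3(t): c, s = np.cos(t), np.sin(t); return np.array([[c, s, 0], [-s, c, 0], [0, 0, 1]])

def celestial_to_terrestrial(X, Y, s, theta, sp, xP, yP):
    """Total rotation x_T = R x_C of Eq. (27); all arguments in radians."""
    a = 1.0 / (1.0 + np.sqrt(1.0 - X*X - Y*Y))            # Eq. (22), exact form
    M = np.array([[1 - a*X*X, -a*X*Y,    -X],
                  [-a*X*Y,    1 - a*Y*Y, -Y],
                  [X,         Y,         1 - a*(X*X + Y*Y)]])  # Eq. (21) without R3(-s)
    return R1(-yP) @ R2(-xP) @ R3(sp + theta - s) @ M

R = celestial_to_terrestrial(X=2e-6, Y=1e-6, s=-4.6e-10,
                             theta=1.234, sp=-2e-11, xP=8e-7, yP=1.5e-6)
print(np.allclose(R @ R.T, np.eye(3)))                    # R is (numerically) orthogonal
```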
The functions X(t), Y(t), s(t) can be evaluated either in terms of given series with coefficients provided by the IERS, or with the use of software subroutines provided by the International Astronomical Union (IAU). These functions implement the adopted IAU theory, and to the theoretical values XIAU, YIAU one needs to add corrections provided by VLBI observations, according to X = XIAU + δX, Y = YIAU + δY. These corrections are provided by the IERS in discrete form on a daily basis. Similar daily discrete values are provided for the polar motion parameters xP, yP. The earth rotation angle is calculated from
$$\displaystyle \begin{aligned} \theta (T_u ) = 2\pi (0.7790572732640 + 1.00273781191135448T_u ), \end{aligned} $$
(28)
where Tu = (Julian date UT1) − 2451545.0 is the number of UT1 Julian days that have passed since 12h UT1, January 1, 2000 (when the UT1 Julian date was 2451545.0). UT1 time is computed from UTC time (which differs from TAI by an integer number of seconds), by adding the difference UT1 − UTC provided by the IERS.
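
Eq. (28) is trivially coded; a minimal sketch (Python; the function name is ours) follows, where the fractional part is taken before the multiplication by 2π, which limits rounding losses for large Tu:
```python
import math

def earth_rotation_angle(jd_ut1):
    """Earth rotation angle theta(Tu) of Eq. (28), in radians, from a UT1 Julian date."""
    Tu = jd_ut1 - 2451545.0                     # UT1 days since 12h UT1, 1 January 2000
    frac = 0.7790572732640 + 1.00273781191135448 * Tu
    return 2.0 * math.pi * (frac % 1.0)         # reduced to [0, 2*pi)

print(earth_rotation_angle(2451545.0))          # ~4.895 rad at the reference epoch
```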

We have repeatedly used above the term “rotation of the earth”. This should be understood as a convenient shorthand for the correct term “rotation of the terrestrial reference system”. It makes sense to talk of the rotation of the earth only in the case of a rigid earth, where the rotation vector remains physically invariant, whatever the choice of the time-independent earth-fixed terrestrial reference system. In the actual case of the deforming earth, the terrestrial reference system needs to be defined for every time epoch, and different choices lead to physically different rotation vectors of the terrestrial system (and not of the earth!).

Another characteristic of the above IERS representation of the rotation of the terrestrial reference system is the replacement of the actual instantaneous rotation vector with smoothed versions, originally the Celestial Ephemeris Pole (CEP) and presently the Celestial Intermediate Pole (CIP). The reasoning behind this choice is that high-frequency sub-daily terms in nutation could not be detected by the (then primarily classical astronomical) observations. Thus, they should be removed from precession-nutation and be matched by corresponding terms in polar motion (which cannot be predicted by theory), in such a way that the total rotation matrix remains unchanged. Leaving aside the absurdity of the idea that one should replace the model of an observed time function with a smoothed version because of the limited temporal resolution capabilities of the observational process, we must remark that modern space techniques are quite capable of detecting sub-diurnal nutation terms. Therefore, the CIP concept needs to be updated.

In accordance with the above terminology, the direction of the first axis of the intermediate celestial reference system \(\vec {e}_1^{IC} \) is called the Celestial Intermediate Origin (CIO), while the direction of the first axis of the intermediate terrestrial reference system \(\vec {e}_1^{IT} \) is called the Terrestrial Intermediate Origin (TIO).

More details on the IERS representation of earth rotation are given in the IERS Conventions [52].

Looking at the original rigorous representation of the rotation matrix
$$\displaystyle \begin{aligned} \mathbf{R} = {\mathbf{R}}_3 ( - F){\mathbf{R}}_2 ( - g){\mathbf{R}}_3 (F + {s}^{\prime} + \theta - s - E){\mathbf{R}}_2 (d){\mathbf{R}}_3 (E), \end{aligned} $$
(29)
we note that it involves 7 parameters (time functions), while an orthogonal matrix can be represented by only three parameters. Therefore, the implemented parameters must satisfy a number of conditions, which at first sight appear to be 7 − 3 = 4. More careful examination reveals the fact that the parameters s and s′ are not independent, since only their difference s′ − s appears in the representation. Indeed the representation remains the same if s and s′ are replaced by \(\overline {s} = s + f\) and \({\overline {s}}^{\prime } = {s}^{\prime } + f\), where f is an arbitrary time function. Therefore the representation depends on 6 parameters (F, g, s′ − s, θ, d, E) which must satisfy 6 − 3 = 3 conditions. In order to find these conditions we remark that the representation involves the explicit use of a rotation axis \(\vec {e}_3^{IT} = \vec {e}_3^{IC} \), independently of whether this is the instantaneous rotation axis, or the CIP, or any other convenient choice. Furthermore the rotation must have angular velocity \(\dot {\theta } = \frac {d\theta }{dt}\), so that the rotation vector corresponding to R must be of the form \(\vec {\omega } = \dot {\theta }\vec {e}_3^{IT} = \dot {\theta }\vec {e}_3^{IC} \), and setting \(\vec {\omega } = \vec {\mathbf {e}}^{IT}\boldsymbol {\upomega }_{IT} = \vec {\mathbf {e}}^{IC}\boldsymbol {\upomega }_{IC} \) the components in both intermediate systems must be \(\boldsymbol {\upomega }_{IT} = \boldsymbol {\upomega }_{IC} = [0\ 0\ \dot {\theta }]^T\). We may find the components ωT of \(\vec {\omega } = \vec {\mathbf {e}}^T\boldsymbol {\upomega }_T \) in the terrestrial reference system using the generalized Euler kinematic equations \([\boldsymbol {\upomega }_T \times ] = \mathbf {R}\frac {d{\mathbf {R}}^T}{dt}\) and then convert them to the intermediate terrestrial system with the polar motion transformation ωIT = WTωT. Thus, the desired conditions follow by setting
$$\displaystyle \begin{aligned} \boldsymbol{\upomega}_{IT} = {\mathbf{W}}^T\boldsymbol{\upomega}_T = \left[ \begin{array}{c} 0 \\ 0 \\ \dot{\theta } \end{array} \right], \end{aligned} $$
(30)
with ωT derived from the Euler kinematic equations. Performing the necessary differentiations (see [27, 30, 33] for details) we arrive at the following three differential equations
$$\displaystyle \begin{aligned} &\mathbf{R}(E + s)\left[ \begin{array}{c} {\dot{E}\sin d} \\ \dot{d} \end{array} \right] = \mathbf{R}(\theta )\mathbf{R}(F + {s}^{\prime})\left[ \begin{array}{c} {\dot{F}\sin g} \\ \dot{g} \end{array} \right], \end{aligned} $$
(31)
$$\displaystyle \begin{aligned} &{\dot{s}}^{\prime} + \dot{F} - \cos g\dot{F} = \dot{s} + \dot{E} - \cos d\dot{E}, \end{aligned} $$
(32)
which constitute the compatibility conditions that the superfluous Earth Orientation Parameters (EOPs) must satisfy for the rotation matrix representation (29) to be mathematically consistent. The first two compatibility conditions (31) are the direction conditions, which assert that \(\omega _{IT}^1 = 0\) and \(\omega _{IT}^2 = 0\), i.e., that the rotation vector induced by the rotation matrix R as given by Eq. (29) has the same direction as the common third axis of the intermediate celestial and the intermediate terrestrial reference system (direction of diurnal rotation): \(\vec {\omega }\| \vec {e}_3^{IT} = \vec {e}_3^{IC} \). The third compatibility condition (32) is the magnitude condition, which asserts that \(\omega _{IT}^3 = \dot {\theta }\). When all three conditions are satisfied then obviously \(\omega = \vert \vec {\omega }\vert = \dot {\theta }\). The satisfaction of the two NRO conditions (18), namely \(\dot {s} = (\cos d - 1)\dot {E}\), \({\dot {s}}^{\prime } = (\cos g - 1)\dot {F}\), clearly guarantees the satisfaction of (32). Since however the direction conditions, which are ignored in the IERS representation, are not satisfied, it is simply wrong to assume that \(\omega = \dot {\theta }\): \(\omega _{IT}^1 \ne 0\), \(\omega _{IT}^2 \ne 0\) and thus \(\omega \ne \omega _{IT}^3 = \dot {\theta }\). We may obtain ω from the relation
$$\displaystyle \begin{aligned} \omega^2 = \dot{\theta }^2 + \Delta \dot{\theta }^2, \end{aligned} $$
(33)
where \(\Delta \dot {\theta }^2 = (\omega _{IT}^1 )^2 + (\omega _{IT}^2 )^2\) is given by
$$\displaystyle \begin{aligned} \Delta \dot{\theta }^2 &= \dot{E}^2\sin^2d + \dot{d}^2 + \dot{F}^2\sin^2g + \dot{g}^2-\\ &\quad - 2\cos ({s}^{\prime} + F + \theta - s - E)(\dot{E}\dot{F}\sin d\sin g + \dot{d}\dot{g}) - \\ &\quad - 2\sin ({s}^{\prime} + F+\theta - s - E)(\dot{E}\dot{g}\sin d - \dot{F}\dot{d}\sin g). \end{aligned} $$
(34)
For the IERS representation (27), the compatibility conditions take the form
$$\displaystyle \begin{aligned} &\mathbf{R}( - s)\left[ \begin{array}{cc} {\left( { - \frac{a^2}{1 - a}XY} \right)\dot{X} - } & {\left( {1 + \frac{a^2}{1 - a}Y^2} \right)\dot{Y}} \\ {\left( {1 + \frac{a^2}{1 - a}X^2} \right)\dot{X} + } & {\left( {\frac{a^2}{1 - a}XY} \right)\dot{Y}} \end{array} \right] = \mathbf{R}( - \theta - {s}^{\prime})\left[ \begin{array}{c} {\dot{y}_P \cos x_P } \\ {\dot{x}_P } \end{array} \right], \end{aligned} $$
(35)
$$\displaystyle \begin{aligned} &aY\dot{X} - aX\dot{Y} - \dot{s} = \dot{y}_P \sin x_p - {\dot{s}}^{\prime}, \end{aligned} $$
(36)
for direction and modulus respectively. The correction \(\Delta \dot {\theta }^2\) for the rotational velocity is given by
$$\displaystyle \begin{aligned} \Delta \dot{\theta }^2 &= \frac{(a^2XY)^2 + (1 - a + a^2X^2)^2}{(1 - a)^2}\dot{X}^2 + \frac{(1 - a + a^2Y^2)^2 + (a^2XY)^2}{(1 - a)^2}\dot{Y}^2 + \\ &\quad + 2\frac{a^2XY\left[2 - 2a + a^2(X^2 + Y^2)\right]}{(1 - a)^2}\dot{X}\dot{Y} + \dot{y}_P^2 \cos^2x_P + \dot{x}_P^2\, + \\ &\quad + 2\frac{\sin (s - {s}^{\prime} - \theta )}{1 - a}\left\{\cos x_P \left[(1 - a + a^2X^2)\dot{X}\dot{y}_P + (a^2XY)\dot{Y}\dot{y}_P \right]\right.+\\ &\quad \left.+ (a^2XY)\dot{X}\dot{x}_P + (1 - a + a^2Y^2)\dot{Y}\dot{x}_P \right\} + \\ &\quad + 2\frac{\cos (s - {s}^{\prime} - \theta )}{1 - a}\left\{\cos x_P \left[(a^2XY)\dot{X}\dot{y}_P + (1 - a + a^2Y^2)\dot{Y}\dot{y}_P \right]\right.-\\ &\quad \left. - (1 - a + a^2X^2)\dot{X}\dot{x}_P - (a^2XY)\dot{Y}\dot{x}_P \right\}. \end{aligned} $$
(37)
The above IERS representation of earth rotation has been effective since January 1, 2003, and is based on the IAU2000 resolutions of the International Astronomical Union. The former representation differs from the new one in two respects: the first is the separation of nutation from precession, and the second is the definition of the two origins (directions of the first axes) that define diurnal rotation. The previous representation was of the form
$$\displaystyle \begin{aligned} \mathbf{R} &= \mathbf{WR}_3 (\mathrm{GST})\,\mathbf{NP} =\\ &= [{\mathbf{R}}_1 (-y_p ){\mathbf{R}}_2 ( - x_p )]{\mathbf{R}}_3 \mathrm{(GST})[{\mathbf{R}}_1 ( - \varepsilon - \Delta \varepsilon ){\mathbf{R}}_3 ( - \Delta \psi ){\mathbf{R}}_1 (\varepsilon )]\times\\ &\quad \times [{\mathbf{R}}_3 ( - z){\mathbf{R}}_2 (\theta ){\mathbf{R}}_3 ( - \zeta )], \end{aligned} $$
(38)
where GST is the Greenwich Sidereal Time, z, θ, ζ are the precession angles, while nutation is defined by the obliquity ε, the nutation in obliquity Δε, and the nutation in longitude Δψ. The representation of polar motion is essentially the same, except for the tuning rotation R3(s′) which is missing, along with its celestial counterpart R3(−s). The precession matrix P transforms coordinates xC in the celestial system \(\vec {\mathbf {e}}^C\) into coordinates xMC = PxC in the mean celestial system \(\vec {\mathbf {e}}^{MC}\). The nutation matrix N transforms coordinates xMC in the mean celestial system \(\vec {\mathbf {e}}^{MC}\) into coordinates xTC = NxMC in the true celestial system \(\vec {\mathbf {e}}^{TC}\), having the axis \(\vec {e}_3^{TC} \) in the direction of the Celestial Ephemeris Pole (CEP), which differs from the direction of the instantaneous rotation vector \(\vec {\omega }\) in a way similar to that of the CIP in the new representation. The diurnal rotation matrix R3(GST) transforms coordinates xTC in the true celestial system \(\vec {\mathbf {e}}^{TC}\) into coordinates xTT = R3(GST)xTC in the true terrestrial system \(\vec {\mathbf {e}}^{TT}\), having its third axis \(\vec {e}_3^{TT} = \vec {e}_3^{TC} \) also in the direction of the CEP. Finally the polar motion matrix W transforms coordinates xTT in the true terrestrial system into coordinates xT = WxTT in the terrestrial system. The first axis \(\vec {e}_1^{TC} \) of the true celestial system is in the direction of the vernal equinox, which is the intersection of the true equator (plane of \(\vec {e}_1^{TC} \), \(\vec {e}_2^{TC} \)) with the ecliptic (plane of the orbit of the earth). The first axis \(\vec {e}_1^{MC} \) of the mean celestial system is in the direction of the mean vernal equinox, which is the intersection of the mean equator (plane of \(\vec {e}_1^{MC} \), \(\vec {e}_2^{MC} \)) with the ecliptic. The direction of the first axis \(\vec {e}_1^{TT} \) of the true terrestrial system has no specific definition. It is simply the direction which results from the transformation \(\vec {\mathbf {e}}^{TT} = \vec {\mathbf {e}}^T\mathbf {W} = \vec {\mathbf {e}}^T{\mathbf {R}}_1 ( - y_P ){\mathbf {R}}_2 ( - x_P )\). A different choice of rotations than R1(−yP)R2(−xP), which also brings \(\vec {e}_3^T \) into the direction \(\vec {e}_3^{TT} \) of the CEP, would have resulted in a different direction of \(\vec {e}_1^{TT} \). The celestial system \(\vec {\mathbf {e}}^C\) was defined to be none other than the mean celestial system of a specific reference epoch, namely 12 UT (Universal Time) of January 1, 2000. A former choice had been the same hour and date of 1950.

4 The Realization of a Reference System Within Data Analysis, in the Case of Rigid Geodetic Networks

In performing data analysis one normally uses a mathematical model ya = f(xa) relating n observables ya to m < n unknown parameters xa in an unambiguous way, i.e., by an injective mapping f : xa → ya, such that for every ya ∈ M ≡ R(f) ⊂ Rn there exists a unique xa such that ya = f(xa). In classical geodesy however, we observe quantities ya, e.g., angles or distances, which depend only on the geometric form of the geodetic network, while coordinates are used as unknown parameters xa. Thus to any value of the observables ya there corresponds an infinite set of unknowns xa, which are the coordinates expressing the network form specified by the observables in the various possible reference systems; the mapping f is no longer injective. Thus, in order to obtain a unique solution, the reference system has to be chosen, either a priori, or within the data analysis process.

As xa varies over Rm (space of the unknowns) the corresponding images f(xa) do not cover the whole of Rn (space of the observations) but only a submanifold
$$\displaystyle \begin{aligned} M = R({\mathbf f}) = \{{\mathbf{y}}^a = {\mathbf f}({\mathbf{x}}^a)\vert {\mathbf{x}}^a \in R^m\}, \end{aligned} $$
(39)
which we will call the observables manifold. It has dimension r = m − d, where d is the number of parameters defining the reference system, e.g., 6 for a 3D network with unknown origin and orientation, or 7 if the network scale is also unknown. This means that there are d superfluous parameters among the m coordinates of the network, as many as the parameters of a transformation that corresponds to a change of the reference system.
To any given ya ∈ M there corresponds a solution manifold or shape manifold
$$\displaystyle \begin{aligned} S_{{\mathbf{y}}^a} = \{{\mathbf{x}}^a \in R^m\vert \mathbf{f}({\mathbf{x}}^a) = {\mathbf{y}}^a\}, \end{aligned} $$
(40)
consisting of all coordinate sets xa giving the same observables ya and the same network configuration. As ya varies over M, the various corresponding manifolds \(S_{{\mathbf {y}}^a} \) have two characteristics:
  (a) they do not intersect (\(S_{{\mathbf {y}}^a} \cap S_{\tilde {\mathbf {y}}^a} = \varnothing \) for \(\tilde {\mathbf {y}}^a \ne {\mathbf {y}}^a)\), and

  (b) they fill up the parameter space Rm (given any xa ∈ Rm there exists a unique manifold \(S_{\mathbf {f}({\mathbf {x}}^a)} \) to which xa belongs).
In mathematical terms we say that the shape manifolds constitute a fibering F of Rm, with each \(S_{{\mathbf {y}}^a}\) being a fiber of F.
One way to define the reference system is by means of an appropriate set of minimal constraints c(xa) = 0, which define a manifold C = {xa ∈ Rm|c(xa) = 0} such that, for every ya ∈ M, C and \(S_{{\mathbf {y}}^a} \) have a single point in common, \(C \cap S_{{\mathbf {y}}^a} = \{{\mathbf {x}}_{C,{\mathbf {y}}^a}^a \}\) (see Fig. 1). Such a manifold C is called a section of the fibering F.
Fig. 1

The geometry of the non-linear (above) and the linearized (below) least squares solutions

Let \({\mathbf {x}}^a \in S_{{\mathbf {y}}^a} \) and \(\tilde {\mathbf {x}}^a \in S_{{\mathbf {y}}^a} \) be two points on the same shape manifold, i.e., \(\mathbf {f}({\mathbf {x}}^a) = \mathbf {f}(\tilde {\mathbf {x}}^a) = {\mathbf {y}}^a\). As they represent the same network configuration there exists a coordinate transformation
$$\displaystyle \begin{aligned} \tilde{\mathbf{x}}^a = T_{\mathbf{p}} ({\mathbf{x}}^a) = \mathbf{t}({\mathbf{x}}^a,\mathbf{p}), \end{aligned} $$
(41)
mapping one into the other, where p are the d transformation parameters. For example, in a three-dimensional network we have the general similarity transformation
$$\displaystyle \begin{aligned} \tilde{\mathbf{x}}^a = \mathbf{t}({\mathbf{x}}^a,\mathbf{p}) = \mathbf{t}({\mathbf{x}}^a;s,\boldsymbol{\uptheta },\mathbf{d}) = (1 + s)\mathbf{R}(\boldsymbol{\uptheta }){\mathbf{x}}^a + \mathbf{d}, \end{aligned} $$
(42)
where p = [θT dT s]T, s is the scale parameter, θ = [θ1 θ2 θ3]T are the rotation angles and d = [d1 d2 d3]T the translation components. Fixing xa, every other point \(\tilde {\mathbf {x}}^a \in S_{{\mathbf {y}}^a} \) is in one-to-one correspondence with the transformation parameters from xa to \(\tilde {\mathbf {x}}^a = \mathbf {t}({\mathbf {x}}^a,\mathbf {p})\). This means that the transformation parameters from the fixed xa may serve as a set of curvilinear coordinates on \(S_{{\mathbf {y}}^a} \). In particular the tangent vectors to the coordinate curves, \(\frac {\partial \mathbf {t}}{\partial p_i }({\mathbf {x}}^a,\mathbf {p} = \mathbf {0})\), i.e., the columns of the matrix Ea = \(\mathbf {E}({\mathbf {x}}^a) \equiv \frac {\partial \mathbf {t}}{\partial \mathbf {p}}({\mathbf {x}}^a,\mathbf {p} = \mathbf {0})\), form a basis for the tangent space \(N_a = T_{{\mathbf {x}}^a} (S_{{\mathbf {y}}^a} )\) to the shape manifold \(S_{{\mathbf {y}}^a} \) at the point xa.
In actual data analysis, the outcomes of the observations yb = ya + e differ from the values of the observables ya, as a consequence of the unavoidable observation errors e, and yb ∉ M. Thus an estimate \(\hat {\mathbf {y}}^a \in M\) of ya needs to be chosen in order to obtain a unique estimate \(\hat {\mathbf {x}}^a\) of the unknown parameters with the help of the additional minimal constraints c(xa) = 0. The usual choice is estimation by the least squares method, where the weighted square distance \(\phi = \vert \vert {\mathbf {y}}^b - \mathbf {f}({\mathbf {x}}^a)\vert \vert _{\mathbf {P}}^2 = [{\mathbf {y}}^b - \mathbf {f}({\mathbf {x}}^a)]^T\mathbf {P}[{\mathbf {y}}^b - \mathbf {f}({\mathbf {x}}^a)]\) of yb from the manifold M is minimized. Application of \(\phi = \min \) under the conditions c(xa) = 0 leads to the rather complicated nonlinear normal equations
$$\displaystyle \begin{aligned} \left[ {\frac{\partial \mathbf{f}}{\partial {\mathbf{x}}^a}(\hat{\mathbf{x}}_{NL}^a )} \right]^T\mathbf{P}[{\mathbf{y}}^b - \mathbf{f}(\hat{\mathbf{x}}_{NL}^a )] + \left[ {\frac{\partial \mathbf{c}}{\partial {\mathbf{x}}^a}(\hat{\mathbf{x}}_{NL}^a )} \right]^T\mathbf{k} = \mathbf{0},\quad \mathbf{c}(\hat{\mathbf{x}}_{NL}^a ) = \mathbf{0}, \end{aligned} $$
(43)
which must be solved for \(\hat {\mathbf {x}}_{NL}^a \) and the Lagrange multipliers k. The observables solution \(\hat {\mathbf {y}}_{NL}^a = \mathbf {f}(\hat {\mathbf {x}}_{NL}^a )\) is in fact the orthogonal projection of the observations yb on the observables manifold M (see Fig. 1). All elements of the corresponding shape manifold \(S_{\hat {\mathbf {y}}_{NL}^a } \) are least squares solutions for the unknown parameters xa, corresponding to the same shape of the geodetic network expressed in different reference systems. The nonlinear constraints c(xa) = 0 merely serve in picking a unique element \(\hat {\mathbf {x}}_{NL}^a \) out of all the elements of \(S_{\hat {\mathbf {y}}_{NL}^a } \), thus choosing a particular reference system. The constraints manifold C = {xa ∈ Rm|c(xa) = 0} of all points satisfying the constraints is a section of the fibering F of all shape manifolds \(S_{{\mathbf {y}}^a} \), provided that it intersects each one of them at a single point \({\mathbf {x}}_{C,{\mathbf {y}}^a} \), i.e., \(C \cap S_{{\mathbf {y}}^a} = \{{\mathbf {x}}_{C,{\mathbf {y}}^a} \}\), ∀ya ∈ M. A necessary and sufficient condition for this (transversality condition) is that the tangent spaces \(T_{{\mathbf {x}}^a} (C)\), \(T_{{\mathbf {x}}^a } (S_{\mathbf {f}({\mathbf {x}}^a )} )\) to the section C and the solution manifold \(S_{\mathbf {f}({\mathbf {x}}^a )} \), respectively, at each point xa ∈ C have only the origin 0 as common element [34].
To avoid solving the complicated nonlinear normal equations (43) by numerical analysis techniques, an iterative scheme is used, based on some initiating approximate values x0 for xa and linearization of the mathematical model and the minimal constraints, if they are not already linear. The linearized model becomes
$$\displaystyle \begin{aligned} \mathbf{b} \equiv {\mathbf{y}}^b - \mathbf{f}({\mathbf{x}}_0 ) = \left[ {\frac{\partial \mathbf{f}}{\partial {\mathbf{x}}^a}({\mathbf{x}}_0 )} \right]({\mathbf{x}}^a - {\mathbf{x}}_0 ) + \mathbf{e} \equiv \mathbf{Ax} + \mathbf{e}, \end{aligned} $$
(44)
and the linearized minimal constraints
$$\displaystyle \begin{aligned} \mathbf{c}({\mathbf{x}}_0 ) + \left[ {\frac{\partial \mathbf{c}}{\partial {\mathbf{x}}^a}({\mathbf{x}}_0 )} \right]({\mathbf{x}}^a - {\mathbf{x}}_0 ) \equiv - \mathbf{d} + {\mathbf{C}}^T\mathbf{x} = \mathbf{0}. \end{aligned} $$
(45)
The original unknowns xa are replaced by the unknown corrections x = xa − x0, the observations yb by the reduced observations b = yb − f(x0) ≡ yb − y0, and the observables ya by the corrections to the observables y = ya − f(x0) = ya − y0. The nonlinear model ya = f(xa) is replaced by the linear model y = Ax, and the nonlinear observation equations yb = f(xa) + e by the linearized ones b = Ax + e. The implementation of a linear model allows us to solve the “choice of weight matrix problem” under the assumption that the errors are random variables with zero mean E{e} = 0 and covariance matrix C = E{eeT} = σ2Q, known up to an unknown scalar factor σ2. According to the celebrated Gauss-Markov theorem, the choice P = Q−1 provides Best Linear Uniformly Unbiased Estimates (BLUUE, or simply BLUE) \(\hat {q} = {\mathbf {a}}^T\hat {\mathbf {x}}\) for all estimable linear functions q = aTx of the parameters, i.e., quantities which are functions q = dTy = dTAx of the observables, or equivalently functions q = aTx with a = ATd ∈ R(AT).
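
As an illustration of how the linearized model (44) and the minimal constraints (45) are combined in practice, one standard way (a sketch in Python/NumPy; the helper name is ours) is to solve the bordered normal equations, whose coefficient matrix is regular exactly when the constraints CTx = d are minimal:
```python
import numpy as np

def lsq_minimal_constraints(A, b, P, C, d=None):
    """Least squares with minimal constraints: minimize (b - A x)^T P (b - A x)
    subject to C^T x = d, via the bordered normal equations
        [ N   C ] [x]   [A^T P b]
        [ C^T 0 ] [k] = [   d   ],   with N = A^T P A.
    """
    m, q = A.shape[1], C.shape[1]
    d = np.zeros(q) if d is None else d
    N, u = A.T @ P @ A, A.T @ P @ b
    K = np.block([[N, C], [C.T, np.zeros((q, q))]])
    sol = np.linalg.solve(K, np.concatenate([u, d]))
    return sol[:m], sol[m:]       # parameter estimate x and Lagrange multipliers k
```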
The essence of the linearization lies in the assumption that the approximate values x0 produce an approximate geometric configuration of the network, expressed by the corresponding approximate values of the observables y0 = f(x0), which is close to the true geometric configuration of the network, expressed by the true values of the observables ya = f(xa). As the observational errors e are small, this allows the estimation process to produce a linear estimate \(\hat {\mathbf {y}}^a = \mathbf {f}(\hat {\mathbf {x}}^a)\) which is close to the true value of the observables ya = f(xa). Even in this case, there are infinitely many coordinate sets \(\hat {\mathbf {x}}^a\) which produce the same value \(\hat {\mathbf {y}}^a\), but may nevertheless vary significantly and be largely different from the approximate values x0. For this reason, we must abandon the idea of arbitrary coordinate transformations from one reference system to another, and seek only solutions xa which are close to the approximate values x0. This restricts the coordinate transformations, and the corresponding choices of reference system, to transformations close to the identity, i.e., transformations \(\tilde {\mathbf {x}}_i^a = (1 + s)\mathbf {R}(\boldsymbol {\uptheta }){\mathbf {x}}_i^a + \mathbf {d}\) with very small parameter values s, θ, d (recall that zero transformation parameter values correspond to the identity transformation). In this case, the linearized close-to-the-identity transformation takes the form \(\tilde {\mathbf {x}}_i^a = {\mathbf {x}}_i^a + s{\mathbf {x}}_i^a + [{\mathbf {x}}_i^a \times ]\boldsymbol {\uptheta } + \mathbf {d}\), which results by utilizing the approximation R(θ) ≈ I − [θ×] and neglecting second and higher order terms. Replacing the entries \({\mathbf {x}}_i^a \) of xa with \({\mathbf {x}}_i^a = {\mathbf {x}}_i^{ap} + {\mathbf {x}}_i \), where \({\mathbf {x}}_i^{ap} \) are the approximate coordinates of station i (entries of x0) and xi the corresponding corrections (entries of x), the transformation becomes
$$\displaystyle \begin{aligned} \tilde{\mathbf{x}}_i \approx {\mathbf{x}}_i + s{\mathbf{x}}_i^{ap} + [{\mathbf{x}}_i^{ap} \times ]\boldsymbol{\uptheta } + \mathbf{d} = {\mathbf{x}}_i + \left[ {[{\mathbf{x}}_i^{ap} \times ] \quad {\mathbf{I}}_3 \quad {\mathbf{x}}_i^{ap} } \right]\left[ \begin{array}{c} \boldsymbol{\uptheta } \\ \mathbf{d} \\ s \end{array} \right] \equiv {\mathbf{x}}_i + {\mathbf{E}}_i \mathbf{p}. \end{aligned} $$
(46)
For all stations the linearized coordinate transformation takes the form \(\tilde {\mathbf {x}} = \mathbf {x} + \mathbf {Ep}\), where
$$\displaystyle \begin{aligned} \mathbf{E} = \left[ \begin{array}{c} \vdots \\ {{\mathbf{E}}_i } \\ \vdots \end{array} \right] = \left[ \begin{array}{ccc} \vdots & \vdots & \vdots \\ {} {[{\mathbf{x}}_i^{ap} \times ]} & {{\mathbf{I}}_3 } & {{\mathbf{x}}_i^{ap} } \\ {} \vdots & \vdots & \vdots \end{array} \right] = \left[ \begin{array}{ccccccc} \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 0 & { - z_i^{ap} } & { y_i^{ap} } & 1 & 0 & 0 & {x_i^{ap} } \\ {} {z_i^{ap} } & 0 & { - x_i^{ap} } & 0 & 1 & 0 & {y_i^{ap} } \\ {} { - y_i^{ap} } & {x_i^{ap} } & 0 & 0 & 0 & 1 & {z_i^{ap} } \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \end{array} \right]. \end{aligned} $$
(47)
This is the linearized three-dimensional similarity transformation, involving rotation, translation and scale change. If one of these transformation elements is missing, the corresponding columns of Ei and rows of p should be omitted. For example, in the rigid transformation without scale change \({\mathbf {E}}_i = \left [ {\left [ {{\mathbf {x}}_i^{ap} \times } \right ]\quad {\mathbf {I}}_3 } \right ]\) and \(\mathbf {p} = \left [ \begin {array}{c} \boldsymbol {\uptheta } \\ \mathbf {d} \end {array} \right ]\). For the case of two-dimensional planar networks, introducing the matrix of planar rotation
$$\displaystyle \begin{aligned} \mathbf{R}(\theta ) = \left[ \begin{array}{cc} {\cos \theta } & {\sin \theta } \\ { - \sin \theta } & {\cos \theta } \end{array} \right], \end{aligned} $$
(48)
the similarity transformation \(\tilde {\mathbf {x}}_i^a = (1 + s)\mathbf {R}(\theta ){\mathbf {x}}_i^a + \mathbf {d}\) takes the linearized form
$$\displaystyle \begin{aligned} \tilde{\mathbf{x}}_i \approx {\mathbf{x}}_i + s{\mathbf{x}}_i^{ap} + [\mathbf{Wx}_i^{ap} ]\theta + \mathbf{d} = {\mathbf{x}}_i + \left[ {\mathbf{Wx}_i^{ap} \quad {\mathbf{I}}_2 \quad {\mathbf{x}}_i^{ap} } \right]\left[ \begin{array}{c} \theta \\ \mathbf{d} \\ s \end{array} \right] \equiv {\mathbf{x}}_i + {\mathbf{E}}_i \mathbf{p}, \end{aligned} $$
(49)
where W = R(90°). For all network points the transformation becomes \(\tilde {\mathbf {x}} = \mathbf {x} + \mathbf {Ep}\), where
$$\displaystyle \begin{aligned} \mathbf{E} = \left[ \begin{array}{c} \vdots \\ {{\mathbf{E}}_i } \\ \vdots \end{array} \right] = \left[ \begin{array}{ccc} \vdots & \vdots & \vdots \\ {\mathbf{Wx}_i^{ap} } & {{\mathbf{I}}_2 } & {{\mathbf{x}}_i^{ap} } \\ \vdots & \vdots & \vdots \end{array} \right] = \left[ \begin{array}{cccc} \vdots & \vdots & \vdots & \vdots \\ {y_i^{ap} } & 1 & 0 & {x_i^{ap} } \\ { - x_i^{ap} } & 0 & 1 & {y_i^{ap} } \\ \vdots & \vdots & \vdots & \vdots \end{array} \right]. \end{aligned} $$
(50)
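Because the matrices (47) and (50) are built entirely from the approximate station coordinates, they are trivial to assemble in practice. The following minimal numerical sketch in Python with NumPy stacks the per-station blocks of (47); the three stations and all names are illustrative assumptions, not part of the text:

import numpy as np

def inner_E_3d(X_ap):
    # Stack the per-station blocks [ [x_i^ap x] | I_3 | x_i^ap ] of (47)
    blocks = []
    for x, y, z in X_ap:
        cross = np.array([[0.0,  -z,   y ],
                          [ z,  0.0,  -x ],
                          [-y,   x,  0.0]])          # the skew-symmetric matrix [x_i^ap x]
        blocks.append(np.hstack([cross, np.eye(3), [[x], [y], [z]]]))
    return np.vstack(blocks)                         # one 3 x 7 block per station

X_ap = np.array([[4075.5,  931.8, 4801.6],           # illustrative approximate coordinates
                 [4033.5,   23.7, 4924.3],
                 [3370.6,  711.9, 5349.8]])
E = inner_E_3d(X_ap)
print(E.shape)                                       # (9, 7): columns for theta, d and s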
In order to understand the relation between the original nonlinear and the linearized estimation problem, we first note that as any coordinate, e.g., \(x_i^a\), varies in R, with the rest of the coordinates remaining fixed, the image \(\mathbf {f}({\mathbf {x}}^a )\) traces a curve on the observables manifold M and the partial derivative \(\frac {\partial \mathbf {f}}{\partial x_i^a }\) is a vector tangent to this curve and hence tangent to M. In particular, the columns of the matrix A, which are by definition the derivatives \({\mathbf {a}}_i = \frac {\partial \mathbf {f}}{\partial x_i^a }({\mathbf {x}}_0 )\), are vectors tangent to M at the point y0 = f(x0) ∈ M. Therefore the m columns of A span the linear manifold A tangent to M at y0 = f(x0), and since the range of A is the set of all linear combinations of its columns, A = R(A). On the other hand, consider within \(S_{{\mathbf {y}}_0 } = S_{\mathbf {f}({\mathbf {x}}_0 )} \) a change of reference system transformation \(\tilde {\mathbf {x}}_0 = \mathbf {t}({\mathbf {x}}_0 , \mathbf {p})\). As each pi varies, with the rest of the transformation parameters p held fixed, \(\tilde {\mathbf {x}}_0 (p_i )\) traces a curve on \(S_{{\mathbf {y}}_0 } \) and the partial derivative \(\frac {\partial \tilde {\mathbf {x}}_0 }{\partial p_i } = \frac {\partial \mathbf {t}}{\partial p_i }\) is a vector tangent to this curve and hence tangent to \(S_{{\mathbf {y}}_0 } \). Therefore the d columns
$$\displaystyle \begin{aligned} {\mathbf{e}}_i =\frac{\partial \mathbf{t}}{\partial p_i }({\mathbf{x}}_0 , \mathbf{p} = {\mathbf 0}), \end{aligned} $$
(51)
of the matrix
$$\displaystyle \begin{aligned} \mathbf{E} = \frac{\partial \mathbf{t}}{\partial \mathbf{p}}({\mathbf{x}}_0 ,\mathbf{p} = \mathbf{0}), \end{aligned} $$
(52)
span the linear manifold \(N = T_{{\mathbf {x}}_0 } (S_{{\mathbf {y}}_0 } ) = R(\mathbf {E})\) tangent to \(S_{{\mathbf {y}}_0 } = S_{\mathbf {f}({\mathbf {x}}_0 )} \) at x0. Considering the composite function (f ∘t)(xa, p) ≡f(t(xa, p)) and recalling that \(\mathbf {f}\left ( {\mathbf {t}\left ( {{\mathbf {x}}^a,\mathbf {p}} \right )} \right ) = \mathbf {f}\left ( {{\mathbf {x}}^a} \right ) = {\mathbf {y}}^a\), ∀p, with ya independent of p, the chain rule for derivatives gives
$$\displaystyle \begin{aligned} \frac{\partial (\mathbf{f} \circ \mathbf{t})}{\partial \mathbf{p}}({\mathbf{x}}^a,\mathbf{0}) = \left( {\frac{\partial \mathbf{f}}{\partial \mathbf{t}}\frac{\partial \mathbf{t}}{\partial \mathbf{p}}} \right)({\mathbf{x}}^a,\mathbf{0}) = {\mathbf{A}}_a {\mathbf{E}}_a = \frac{\partial {\mathbf{y}}^a}{\partial \mathbf{p}} = \mathbf{0}, \end{aligned} $$
(53)
where we have set \({\mathbf {A}}_a = \frac {\partial \mathbf {f}}{\partial {\mathbf {x}}^a}({\mathbf {x}}^a)\) and \({\mathbf {E}}_a = \frac {\partial \mathbf {t}}{\partial \mathbf {p}}({\mathbf {x}}^a, \mathbf {0})\) as before. In particular at x0 it holds that
$$\displaystyle \begin{aligned} \mathbf{AE} = \mathbf{0}, \end{aligned} $$
(54)
where \(\mathbf {A} = \frac {\partial \mathbf {f}}{\partial {\mathbf {x}}^a}({\mathbf {x}}_0 )\) and \(\mathbf {E} = \frac {\partial \mathbf {t}}{\partial \mathbf {p}}({\mathbf {x}}_0 , \mathbf {p} = {\mathbf 0})\) as before. Relation (54) means that each column ek, k = 1, 2, …, d, of E is mapped by A into Aek = 0. Since the columns of E are linearly independent, they form a basis for the null space N(A) of A, and the tangent manifold defined above can now be identified as
$$\displaystyle \begin{aligned} N = T_{{\mathbf{x}}_0 } (S_{{\mathbf{y}}_0 } ) = R(\mathbf{E}) = N(\mathbf{A}) \equiv \{\mathbf{x} \in R^m\vert \mathbf{Ax} = \mathbf{0}\}. \end{aligned} $$
(55)
The situation is similar at xa where \(N_a = T_{{\mathbf {x}}^a} (S_{{\mathbf {y}}^a} ) = R({\mathbf {E}}_a ) = N({\mathbf {A}}_a )\).
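The key property AE = 0 established in (54) is easy to check numerically. The following minimal sketch (Python/NumPy; the four-station planar distance network is an illustrative assumption) builds the design matrix of observed distances together with the matrix E of (50), omitting the scale column since distances determine the scale and leave only the rotational and translational defect d = 3:

import numpy as np

xy = np.array([[0.0, 0.0], [10.0, 0.0], [7.0, 8.0], [2.0, 6.0]])   # approximate coordinates
pairs = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]           # observed distances

def dist_design(xy, pairs):
    # Jacobian of the distances s_ij = |x_j - x_i| with respect to (x_1, y_1, ..., x_4, y_4)
    A = np.zeros((len(pairs), 2 * len(xy)))
    for k, (i, j) in enumerate(pairs):
        u = (xy[j] - xy[i]) / np.linalg.norm(xy[j] - xy[i])        # unit direction from i to j
        A[k, 2*i:2*i+2], A[k, 2*j:2*j+2] = -u, u
    return A

def inner_E_2d(xy):
    # Per-station blocks [W x_i^ap | I_2] of (50), with the scale column omitted
    return np.vstack([np.array([[y, 1.0, 0.0], [-x, 0.0, 1.0]]) for x, y in xy])

A, E = dist_design(xy, pairs), inner_E_2d(xy)
print(np.abs(A @ E).max())   # ~1e-15: every column of E lies in the null space N(A)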
The linearization replaces the original nonlinear mapping f : RmRn : xaya, from Rm to Rn with its derivative mapping
$$\displaystyle \begin{aligned} \mathbf{A} = \nabla \mathbf{f}:T_{{\mathbf{x}}_0 } (R^m) \to T_{{\mathbf{y}}_0 } (R^n), \end{aligned} $$
(56)
from the tangent space \(T_{{\mathbf {x}}_0 } (R^m)\) (at x0 to Rm) to the tangent space \(T_{{\mathbf {y}}_0 } (R^n)\) (at y0 = f(x0) to Rn). Since \(T_{{\mathbf {x}}_0 } (R^m)\) and \(T_{{\mathbf {y}}_0 } (R^n)\) are essentially Rm and Rn with their origins shifted to x0 and y0 = f(x0), respectively, we may simply consider the linear mapping A : Rm → Rn : x →y. The estimation by least squares provides a unique reduced observable estimate \(\hat {\mathbf {y}} = \mathbf {A}\hat {\mathbf {x}}\), which is the projection of the reduced observations b = yb −y0 = yb −f(x0) onto the linear manifold A = R(A), obtained by minimizing the weighted norm
$$\displaystyle \begin{aligned} \vert \vert \mathbf{b} - \mathbf{y}\vert \vert_{\mathbf{P}}^2 = \vert \vert \mathbf{b} - \mathbf{Ax}\vert \vert_{\mathbf{P}}^2 = (\mathbf{b} - \mathbf{Ax})^T\mathbf{P}(\mathbf{b} - \mathbf{Ax}) = \min . \end{aligned} $$
(57)
Note that, as a consequence of the linearization errors, \(\hat {\mathbf {y}}_L^a \equiv {\mathbf {y}}_0 + \hat {\mathbf {y}}\) differs from the least squares solution \(\hat {\mathbf {y}}_{NL}^a \) of the nonlinear problem (\(\hat {\mathbf {y}}_L^a \ne \hat {\mathbf {y}}_{NL}^a\)), and also \(\hat {\mathbf {y}}_L^a \notin M\). If \(\hat {\mathbf {x}}\) is a solution to the linearized least squares problem (\(\mathbf {A}\hat {\mathbf {x}} = \hat {\mathbf {y}}\)), then \(\hat {\mathbf {x}}_L^a \equiv {\mathbf {x}}_0 + \hat {\mathbf {x}} \ne \hat {\mathbf {x}}_{NL}^a \) for all least squares solutions \(\hat {\mathbf {x}}_{NL}^a \) of the nonlinear problem. From \(\hat {\mathbf {x}}\) we may obtain any other solution \({\hat {\mathbf {x}}^{\prime }} = \hat {\mathbf {x}} + \mathbf {Ep}\) by adding a linear combination Ep of the columns of E, i.e., any vector belonging to the null space of A (Ep ∈ N(A) = R(E)). As a consequence of the linearization errors, \({\hat {\mathbf {x}}}_L^{\prime a} \equiv {\mathbf {x}}_0 + {\hat {\mathbf {x}}^{\prime }} = {\mathbf {x}}_0 + \hat {\mathbf {x}} + \mathbf {Ep} = \hat {\mathbf {x}}_L^a + \mathbf {Ep}\) differs from any corresponding nonlinear estimate \({\hat {\mathbf {x}}}_{NL}^{\prime a} \) (\({\hat {\mathbf {x}}}_L^{\prime a} \ne {\hat {\mathbf {x}}}_{NL}^{\prime a} \)); in fact \({\hat {\mathbf {x}}}_L^{\prime a} \notin S_{\hat {\mathbf {y}}_{NL}^a } \). However, when the approximate values are sufficiently close to the true ones, these differences turn out to be insignificant in most real-life problems, since they remain below the level of the observational errors: the difference between the estimate \(\hat {\mathbf {y}}_L^a \equiv {\mathbf {y}}_0 + \hat {\mathbf {y}}\) and the (unknown) true value of ya, caused by the observational errors, is potentially considerably larger than the difference between the linearized estimate \(\hat {\mathbf {y}}_L^a \) and the nonlinear one \(\hat {\mathbf {y}}_{NL}^a \).
Within the linearized approach, given any \(\hat {\mathbf {x}}\) such that \(\hat {\mathbf {y}} = \mathbf {A}\hat {\mathbf {x}}\), we may seek, among all other least squares solutions \({\hat {\mathbf {x}}^{\prime }} = \hat {\mathbf {x}} + \mathbf {Ep}\), p ∈ Rd, the one \(\hat {\mathbf {x}}_E \) with minimum length \(\vert \vert \hat {\mathbf {x}}_E \vert \vert = \mathop {\min } \limits _{\mathbf {p}} \vert \vert {\hat {\mathbf {x}}^{\prime }}\vert \vert = \mathop {\min } \limits _{\mathbf {p}} \sqrt {{\hat {\mathbf {x}}}^{\prime T}{\hat {\mathbf {x}}^{\prime }}} \), so that \(\hat {\mathbf {x}}_E^a = {\mathbf {x}}_0 + \hat {\mathbf {x}}_E \) is as close as possible to the approximate values x0. Equivalently, we may minimize instead \(\phi = \vert \vert {\hat {\mathbf {x}}^{\prime }}\vert \vert ^2 = {\hat {\mathbf {x}}}^{\prime T}{\hat {\mathbf {x}}^{\prime }} = (\hat {\mathbf {x}} + \mathbf {Ep})^T(\hat {\mathbf {x}} + \mathbf {Ep})\), by setting \(\frac {\partial \phi }{\partial \mathbf {p}} = \mathbf {0}\), which leads to \({\mathbf {p}}_E = - ({\mathbf {E}}^T\mathbf {E})^{ - 1}{\mathbf {E}}^T\hat {\mathbf {x}}\) and thus
$$\displaystyle \begin{aligned} \hat{\mathbf{x}}_E = \hat{\mathbf{x}} + \mathbf{Ep}_E = \hat{\mathbf{x}} - \mathbf{E}\,({\mathbf{E}}^T\mathbf{E})^{ - 1}{\mathbf{E}}^T\hat{\mathbf{x}}.\end{aligned} $$
(58)
It is easy to verify that the minimum norm solution \(\hat {\mathbf {x}}_E \) satisfies the so-called inner constraints \({\mathbf {E}}^T\hat {\mathbf {x}}_E = \mathbf {0}\). In fact, it is completely identified among all least squares solutions by the inner constraints, since if \({\hat {\mathbf {x}}^{\prime }} = \hat {\mathbf {x}} + \mathbf {Ep}\) is another least squares solution satisfying \({\mathbf {E}}^T{\hat {\mathbf {x}}^{\prime }} = {\mathbf {E}}^T\hat {\mathbf {x}} + {\mathbf {E}}^T\mathbf {E}\,\mathbf {p} = \mathbf {0}\), then
$$\displaystyle \begin{aligned} \mathbf{0} = {\mathbf{E}}^T{\hat{\mathbf{x}}^{\prime}} - {\mathbf{E}}^T\hat{\mathbf{x}}_E = ({\mathbf{E}}^T\hat{\mathbf{x}} + {\mathbf{E}}^T\mathbf{Ep}) - ({\mathbf{E}}^T\hat{\mathbf{x}} + {\mathbf{E}}^T\mathbf{Ep}_E ) = {\mathbf{E}}^T\mathbf{E}(\mathbf{p} - {\mathbf{p}}_E ) \end{aligned}$$
implies that p = pE and hence \({\hat {\mathbf {x}}^{\prime }} = \hat {\mathbf {x}}_E \). Note that if \(\hat {\mathbf {x}}\) and \({\hat {\mathbf {x}}^{\prime }}\) are any two elements of \(S_{\hat {\mathbf {y}}} \) then \(\mathbf {A}({\hat {\mathbf {x}}^{\prime }} - \hat {\mathbf {x}}) = \mathbf {A}\hat {\mathbf {x}}^{\prime } - \mathbf {A}\hat {\mathbf {x}} = \hat {\mathbf {y}} - \hat {\mathbf {y}} = {\mathbf 0}\) and \({\hat {\mathbf {x}}^{\prime }} - \hat {\mathbf {x}} \in N(\mathbf {A})\). Thus \(S_{\hat {\mathbf {y}}} \) is a linear variety (hyperplane) parallel to the null space N(A) and for a known \(\hat {\mathbf {x}} \in S_{\hat {\mathbf {y}}} \), it can be represented as a linear variety \(S_{\hat {\mathbf {y}}} =\hat {\mathbf {x}} + N(\mathbf {A})\). The minimum norm solution can be visualized as the orthogonal projection of the origin 0 (moved to x0) on \(S_{\hat {\mathbf {y}}} \).
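Continuing the planar sketch above (with A and E as constructed there), the projection (58) onto the minimum norm solution can be illustrated as follows; the simulated observations and the choice P = I are again illustrative assumptions:

rng = np.random.default_rng(0)
b = A @ rng.normal(size=A.shape[1]) + 1e-3 * rng.normal(size=A.shape[0])  # reduced observations

x_any = np.linalg.lstsq(A, b, rcond=None)[0] + E @ rng.normal(size=3)  # some LS solution x' = x + Ep
x_E = x_any - E @ np.linalg.solve(E.T @ E, E.T @ x_any)                # projection (58)

print(np.allclose(E.T @ x_E, 0.0))                   # the inner constraints E^T x_E = 0 hold
print(np.allclose(A @ x_E, A @ x_any))               # same fitted observables: still least squares
print(np.linalg.norm(x_E) < np.linalg.norm(x_any))   # and of smaller norm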

As y varies over the range linear manifold A = R(A), the corresponding solution manifolds Sy form a fibering F of Rm, although in linear algebra the term quotient space is used instead. In this case the section C, consisting of the elements satisfying the linear constraints CTx = d, is also a linear manifold, one that intersects each Sy at a single element. A necessary and sufficient condition for this is that it intersects the null space at a single element, C ∩ N(A) = {xC,N}. In this respect, the linear constraints CTx = d are characterized as minimal constraints, because they provide a single element \(\hat {\mathbf {x}}_C \in C \cap S_{\hat {\mathbf {y}}} \) out of all least squares solutions \(\hat {\mathbf {x}} \in S_{\hat {\mathbf {y}}} \), thus choosing a particular reference system. Note that \(\hat {\mathbf {x}}_C = \hat {\mathbf {x}}_E + {\mathbf {x}}_{C,N} \) holds in this case.

The inner constraints ETx = 0 are a particular set of minimal constraints which provide the minimum norm solution, so that \(\vert \vert \hat {\mathbf {x}}_E \vert \vert = \vert \vert \hat {\mathbf {x}}_E^a - {\mathbf {x}}_0 \vert \vert = \mathop {\min } \limits _{\hat {\mathbf {x}} \in S_{\hat {\mathbf {y}}} } \). The reference system to which the final coordinates \(\hat {\mathbf {x}}_E^a = {\mathbf {x}}_0 + \hat {\mathbf {x}}_E \) belong can be interpreted as follows: The approximate coordinates x0 define an approximate geodetic network with a reference system attached to it. The least squares solution \(\hat {\mathbf {y}}^a = {\mathbf {y}}_0 + \hat {\mathbf {y}} = \mathbf {f}({\mathbf {x}}_0 ) + \hat {\mathbf {y}}\) for the observables fully determines the geometric configuration of the network. The minimum norm principle \(\vert \vert \hat {\mathbf {x}}_E^a - {\mathbf {x}}_0 \vert \vert ^2 =\)\((\hat {\mathbf {x}}_E^a - {\mathbf {x}}_0 )^T(\hat {\mathbf {x}}_E^a - {\mathbf {x}}_0 ) = \mathop {\min } \limits _{{\hat {\mathbf {x}}^a_E} \in S_{\hat {\mathbf {y}}} } \) provides a least squares fit (with identity weight matrix) of the estimated network configuration to the approximate one. Once the best fit is realized, the estimated network configuration inherits the reference system of the approximate network.

The so-called inner constraint matrix E is easily accessible in geodesy, thanks to our understanding of the physical reasons that cause the rank deficiency of the design matrix A. This fact provides a series of results that have escaped the attention of statisticians.

In the above discussion we considered only network station coordinates as unknowns. In a real problem there may be other types of parameters, some of which are invariant under a change of the reference system while others are not. For the latter it suffices to know how they transform under a change of the reference system, in order to determine through linearization the total inner constraints matrix referring to all the unknown parameters of the particular model.

5 Least Squares Estimation for Models Without Full Rank Utilizing Minimal Constraints

We proceed with the solution to the problem of estimation of unknown parameters in the case of a linear model without full rank
$$\displaystyle \begin{aligned} \mathbf{b} = \mathbf{Ax} + \mathbf{e},\qquad \mathbf{e}\sim (\mathbf{0}, \sigma^2{\mathbf{P}}^{ - 1}), \end{aligned} $$
(59)
where A has dimensions n × m and rank r(A) = r < m < n, with a rank defect d = m − r.
To apply the least squares principle \(\phi \,{=}\, \vert \vert \mathbf {b} \,{-}\, \mathbf {y}\vert \vert _{\mathbf {P}}^2 \,{=}\, (\mathbf {b} \,{-}\, \mathbf {Ax})^T\mathbf {P}(\mathbf {b} \,{-}\, \mathbf {Ax}) \,{=}\, \min \) under the additional minimal constraints CTx = d we form the Lagrangean Φ = ϕ − 2kT(CTx −d) and set \(\frac {\partial \varPhi }{\partial \mathbf {x}} = \mathbf {0}\) and \(\frac {\partial \varPhi }{\partial \mathbf {k}} = \mathbf {0}\), which leads to the linear normal equations
$$\displaystyle \begin{aligned} \left[ \begin{array}{cc} \mathbf{N} & \mathbf{C} \\ {{\mathbf{C}}^T} & \mathbf{0} \end{array} \right]\left[ \begin{array}{c} \hat{\mathbf{x}} \\ \mathbf{k} \end{array} \right] = \left[ \begin{array}{c} \mathbf{u} \\ \mathbf{d} \end{array} \right], \end{aligned} $$
(60)
involving Lagrange multipliers k, where we have set N = ATPA and u = ATPb. The solution of the above equations, and in particular the inversion of the augmented coefficient matrix, relies on the fact that the matrices N and C are not completely independent. Indeed for the set of d = m − rank(A) constraints CTx = d to be a set of minimal constraints one of the following two equivalent conditions must hold
$$\displaystyle \begin{aligned} rank\left[ \begin{array}{c} \mathbf{A} \\ {{\mathbf{C}}^T} \end{array} \right] = rank\left[ \begin{array}{c} \mathbf{N} \\ {{\mathbf{C}}^T} \end{array} \right] = m. \end{aligned} $$
(61)
Another form of the sought inverse relies on the matrix E satisfying the relation AE = 0, which also implies NE = 0. As a consequence of the above interdependencies the following relations hold
$$\displaystyle \begin{aligned} &(\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{C} = \mathbf{E}({\mathbf{C}}^T\mathbf{E})^{ - 1}, \end{aligned} $$
(62)
$$\displaystyle \begin{aligned} &(\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{N} = \mathbf{I} - \mathbf{E}({\mathbf{C}}^T\mathbf{E})^{ - 1}{\mathbf{C}}^T, \end{aligned} $$
(63)
$$\displaystyle \begin{aligned} &{\mathbf{C}}^T(\mathbf{N} + \mathbf{CC}^T)^{ - 1}{\mathbf{A}}^T = \mathbf{0}, \end{aligned} $$
(64)
$$\displaystyle \begin{aligned} &{\mathbf{C}}^T(\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{N} = \mathbf{0}, \end{aligned} $$
(65)
$$\displaystyle \begin{aligned} &\mathbf{N}(\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{N} = \mathbf{N}, \end{aligned} $$
(66)
$$\displaystyle \begin{aligned} &{\mathbf{C}}^T(\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{C} = \mathbf{I}, \end{aligned} $$
(67)
$$\displaystyle \begin{aligned} &(\mathbf{N} + \mathbf{CC}^T)^{ - 1} - (\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{CC}^T(\mathbf{N} + \mathbf{CC}^T)^{ - 1}=\\ &\qquad = (\mathbf{N} + \mathbf{CC}^T)^{ - 1} - \mathbf{E}({\mathbf{E}}^T\mathbf{CC}^T\mathbf{E})^{ - 1}{\mathbf{E}}^T, \end{aligned} $$
(68)
$$\displaystyle \begin{aligned} &(\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{u} + \mathbf{E}({\mathbf{C}}^T\mathbf{E})^{ - 1}\mathbf{d} = (\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{u} + (\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{Cd}. \end{aligned} $$
(69)
The proof of the above relations is rather straightforward. From NE = 0 we have (N + CCT)E = NE + CCTE = CCTE, and since both CTE and N + CCT are regular, (N + CCT)E(CTE)−1 = C and E(CTE)−1 = (N + CCT)−1C, which is (62).
For the proof of (63) we multiply E(CTE)−1 = (N + CCT)−1C from the right with CT and get
$$\displaystyle \begin{aligned} \mathbf{E}({\mathbf{C}}^T\mathbf{E})^{ - 1}{\mathbf{C}}^T &= (\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{CC}^T =\\ &= (\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{CC}^T + (\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{N}- (\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{N} = \\ &= (\mathbf{N} + \mathbf{CC}^T)^{ - 1}(\mathbf{N} + \mathbf{CC}^T) - (\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{N} = \\ &=\mathbf{I} - (\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{N} \end{aligned} $$
which gives (63).

For the proof of (64) we multiply the transpose of (62) CT(N + CCT)−1 = (ETC)−1ET with AT from the right and since ETAT = (AE)T = 0 we obtain CT(N + CCT)−1AT = (ETC)−1ETAT = 0, which is (64).

For the proof of (65) we multiply the transpose of (62) with N from the right and since ETN = 0 we obtain CT(N + CCT)−1N = (ETC)−1ETN = 0, which is (65).

For the proof of (66) we take into account the transpose of (65) N(N + CCT)−1C = 0 to obtain
$$\displaystyle \begin{aligned} \mathbf{N}(\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{N} &= \mathbf{N}(\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{N} + \mathbf{N}(\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{CC}^T-\\ &\quad - \mathbf{N}(\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{CC}^T =\\ &= \mathbf{N}(\mathbf{N} + \mathbf{CC}^T)^{ - 1}(\mathbf{N} + \mathbf{CC}^T) - \mathbf{N}(\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{CC}^T = \mathbf{N} \end{aligned} $$
and (66) has been proved.
For the proof of (67) we start from (65):
$$\displaystyle \begin{aligned} \mathbf{0} &= {\mathbf{C}}^T(\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{N} = {\mathbf{C}}^T(\mathbf{N} + \mathbf{CC}^T)^{ - 1}(\mathbf{N} + \mathbf{CC}^T - \mathbf{CC}^T)=\\ & = {\mathbf{C}}^T - {\mathbf{C}}^T(\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{CC}^T. \end{aligned} $$
Multiplying from the right with C gives
$$\displaystyle \begin{aligned} &{\mathbf{C}}^T\mathbf{C} - {\mathbf{C}}^T(\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{CC}^T\mathbf{C} = \mathbf{0},\\ &{\mathbf{C}}^T(\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{CC}^T\mathbf{C} = {\mathbf{C}}^T\mathbf{C}\quad \mathrm{and}\quad {\mathbf{C}}^T(\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{C} = {\mathbf{C}}^T\mathbf{C}({\mathbf{C}}^T\mathbf{C})^{ - 1} = \mathbf{I}, \end{aligned} $$
which is (67).
For the proof of (68), we take into account (62) and its transpose CT(N + CCT)−1 = (ETC)−1ET to directly obtain
$$\displaystyle \begin{aligned} (\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{CC}^T(\mathbf{N} + \mathbf{CC}^T)^{ - 1} = \mathbf{E}({\mathbf{C}}^T\mathbf{E})^{ - 1}({\mathbf{E}}^T\mathbf{C})^{ - 1}{\mathbf{E}}^T = \mathbf{E}({\mathbf{E}}^T\mathbf{CC}^T\mathbf{E})^{ - 1}{\mathbf{E}}^T \end{aligned}$$
and (68) is proved.

To prove (69) we simply replace, on the left-hand side, E(CTE)−1 with (N + CCT)−1C, according to (62).
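All of the relations (62), (63), (64), (65), (66), (67), (68), and (69) also lend themselves to direct numerical verification. The sketch below (Python/NumPy) checks a representative subset on a synthetic rank-deficient problem; the dimensions, the random E and C, and the construction of A with AE = 0 are illustrative assumptions, not part of the text:

import numpy as np

rng = np.random.default_rng(1)
m, n, d = 8, 12, 3
E = rng.normal(size=(m, d))                               # known basis of the null space
A = rng.normal(size=(n, m)) @ (np.eye(m) - E @ np.linalg.solve(E.T @ E, E.T))   # AE = 0
N = A.T @ A                                               # P = I for simplicity
C = rng.normal(size=(m, d))                               # generic minimal constraints
Ri = np.linalg.inv(N + C @ C.T)                           # regular by (61)

print(np.allclose(Ri @ C, E @ np.linalg.inv(C.T @ E)))    # (62)
print(np.allclose(C.T @ Ri @ A.T, 0.0))                   # (64)
print(np.allclose(N @ Ri @ N, N))                         # (66)
print(np.allclose(C.T @ Ri @ C, np.eye(d)))               # (67)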

Setting
$$\displaystyle \begin{aligned} {\mathbf{Q}}_C = (\mathbf{N} + \mathbf{CC}^T)^{ - 1} - \mathbf{E}({\mathbf{E}}^T\mathbf{CC}^T\mathbf{E})^{ - 1}{\mathbf{E}}^T, \end{aligned} $$
(70)
it is easy to establish that
$$\displaystyle \begin{aligned} &{\mathbf{Q}}_C \mathbf{C} = \mathbf{0},\qquad {\mathbf C}^T{\mathbf{Q}}_C = \mathbf{0}, \end{aligned} $$
(71)
$$\displaystyle \begin{aligned} &\mathbf{NQ}_C + \mathbf{C}({\mathbf{E}}^T\mathbf{C})^{ - 1}{\mathbf{E}}^T = \mathbf{I}, \end{aligned} $$
(72)
$$\displaystyle \begin{aligned} &\mathbf{NQ}_C \mathbf{N} = \mathbf{N}, \end{aligned} $$
(73)
$$\displaystyle \begin{aligned} &{\mathbf{Q}}_C \mathbf{NQ}_C = {\mathbf{Q}}_C . \end{aligned} $$
(74)
Using (62) it follows that QCC = (N + CCT)−1C −E(CTE)−1(ETC)−1ETC = E(CTE)−1 −E(CTE)−1 = 0 and (71) is proved. Utilizing NE = 0 and the transpose of (63), namely N(N + CCT)−1 = I −C(ETC)−1ET we have
$$\displaystyle \begin{aligned} \mathbf{NQ}_C &= \mathbf{N}(\mathbf{N} + \mathbf{CC}^T)^{ - 1} - \mathbf{NE}({\mathbf{C}}^T\mathbf{E})^{ - 1}({\mathbf{E}}^T\mathbf{C})^{ - 1}{\mathbf{E}}^T = \mathbf{N}(\mathbf{N} + \mathbf{CC}^T)^{ - 1}=\\ &= \mathbf{I} - \mathbf{C}({\mathbf{E}}^T\mathbf{C})^{ - 1}{\mathbf{E}}^T \end{aligned} $$
and (72) follows. Multiplying (72) with N from the right we get NQCN + C(ETC)−1ETN = N and since ETN = 0, (73) follows. Multiplying (72) with QC from the left, it follows that QCNQC + QCC(ETC)−1ET = QC and since according to (71) QCC = 0, (74) follows.
The solution of the normal equations (60) has two different but equivalent forms. We shall first obtain the form that implements the inner constraints matrix E, and then the form that does not. The inverse of the coefficient matrix in the augmented normal equations has the form
$$\displaystyle \begin{aligned} \left[ \begin{array}{cc} \mathbf{N} & \mathbf{C} \\ {{\mathbf{C}}^T} & \mathbf{0} \end{array} \right]^{ - 1} = \left[ \begin{array}{cc} {{\mathbf{Q}}_C } & {\mathbf{E}({\mathbf{C}}^T\mathbf{E})^{ - 1}} \\ {({\mathbf{E}}^T\mathbf{C})^{ - 1}{\mathbf{E}}^T} & \mathbf{0} \end{array} \right]. \end{aligned} $$
(75)
This relation is easy to verify by multiplying the original matrix with its inverse, taking into account that NE = 0 as well as relations (71) and (72). Utilizing the above inverse, the solution takes the form
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} \hat{\mathbf{x}} \\ \mathbf{k} \end{array} \right] &= \left[ \begin{array}{cc} \mathbf{N} & \mathbf{C} \\ {{\mathbf{C}}^T} & \mathbf{0} \end{array} \right]^{ - 1}\left[ \begin{array}{c} \mathbf{u} \\ \mathbf{d} \end{array} \right] = \left[ \begin{array}{cc} {{\mathbf{Q}}_C } & {\mathbf{E}({\mathbf{C}}^T\mathbf{E})^{ - 1}} \\ {({\mathbf{E}}^T\mathbf{C})^{ - 1}{\mathbf{E}}^T} & \mathbf{0} \end{array} \right]\left[ \begin{array}{c} \mathbf{u} \\ \mathbf{d} \end{array} \right] = \\ &= \left[ \begin{array}{c} {{\mathbf{Q}}_C \mathbf{u} + \mathbf{E}({\mathbf{C}}^T\mathbf{E})^{ - 1}\mathbf{d}} \\ {({\mathbf{E}}^T\mathbf{C})^{ - 1}{\mathbf{E}}^T\mathbf{u}} \end{array} \right],\end{aligned} $$
(77)
or explicitly \(\hat {\mathbf {x}} = {\mathbf {Q}}_C \mathbf {u} + \mathbf {E}({\mathbf {C}}^T\mathbf {E})^{ - 1}\mathbf {d}\) and k = (ETC)−1ETu.
From AE=0 and its transpose ETAT = 0 it follows that ETu = ETATPb = 0 and k = (ETC)−1ETu = 0, while QCu = (N + CCT)−1u −E(ETCCTE)−1ETu = (N + CCT)−1u and thus \(\hat {\mathbf {x}}_C = (\mathbf {N} + \mathbf {CC}^T)^{ - 1}\mathbf {u} + \mathbf {E}({\mathbf {C}}^T\mathbf {E})^{ - 1}\mathbf {d}\). Therefore the solution to the least squares problem in the rank deficient model with the implementation of minimal constraints CTx = d is given by
$$\displaystyle \begin{aligned} \hat{\mathbf{x}}_C = {\mathbf{Q}}_C \mathbf{u} + \mathbf{E}({\mathbf{C}}^T\mathbf{E})^{ - 1}\mathbf{d} = (\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{u} + \mathbf{E}({\mathbf{C}}^T\mathbf{E})^{ - 1}\mathbf{d},\qquad \mathbf{k} = \mathbf{0}.\end{aligned} $$
(78)
The covariance matrix of \(\hat {\mathbf {x}}_C \) is given by \({\mathbf {C}}_{\hat {\mathbf {x}}_C } = \sigma ^2{\mathbf {Q}}_{\hat {\mathbf {x}}_C } \), where the covariance factor matrix \({\mathbf {Q}}_{\hat {\mathbf {x}}_C } \) can be obtained from the known covariance factor matrix of the observations Qb = P−1 and the linear relation (78) in the form \(\hat {\mathbf {x}}_C = {\mathbf {Q}}_C {\mathbf {A}}^T\mathbf {Pb} + \mathbf {E}({\mathbf {C}}^T\mathbf {E})^{ - 1}\mathbf {d}\). Application of the law of covariance propagation gives \({\mathbf {Q}}_{\hat {\mathbf {x}}_C } = {\mathbf {Q}}_C {\mathbf {A}}^T\mathbf {PQ}_{\mathbf {b}} \left [ {{\mathbf {Q}}_C {\mathbf {A}}^T\mathbf {P}} \right ]^T = {\mathbf {Q}}_C {\mathbf {A}}^T\mathbf {PAQ}_C = {\mathbf {Q}}_C \mathbf {NQ}_C \) and, in view of (74),
$$\displaystyle \begin{aligned} {\mathbf{Q}}_{\hat{\mathbf{x}}_C } = {\mathbf{Q}}_C = (\mathbf{N} + \mathbf{CC}^T)^{ - 1} - \mathbf{E}({\mathbf{E}}^T\mathbf{CC}^T\mathbf{E})^{ - 1}{\mathbf{E}}^T.\end{aligned} $$
(79)
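Both the bordered system (60) and the closed form (78) can be computed and compared numerically, together with the predicted vanishing of the Lagrange multipliers. The self-contained sketch below repeats the synthetic rank-deficient test problem of the previous sketch (again an illustrative assumption):

import numpy as np

rng = np.random.default_rng(1)
m, n, d = 8, 12, 3
E = rng.normal(size=(m, d))
A = rng.normal(size=(n, m)) @ (np.eye(m) - E @ np.linalg.solve(E.T @ E, E.T))  # AE = 0
N, C = A.T @ A, rng.normal(size=(m, d))
u, dvec = A.T @ rng.normal(size=n), rng.normal(size=d)    # u = A^T P b with P = I

K = np.block([[N, C], [C.T, np.zeros((d, d))]])           # coefficient matrix of (60)
x_C, k = np.split(np.linalg.solve(K, np.concatenate([u, dvec])), [m])
x_78 = np.linalg.solve(N + C @ C.T, u) + E @ np.linalg.solve(C.T @ E, dvec)    # (78)
print(np.allclose(x_C, x_78))                             # the two forms coincide
print(np.allclose(k, 0.0))                                # k = 0, as predicted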
In order to derive a form of the solution to the normal equations that does not implement the matrix E, we need to make use of the relation
$$\displaystyle \begin{aligned} (\mathbf{N} + \mathbf{CC}^T)^{ - 1} - (\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{CC}^T(\mathbf{N} + \mathbf{CC}^T)^{ - 1} = (\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{N}(\mathbf{N} + \mathbf{CC}^T)^{ - 1}, \end{aligned} $$
(80)
which is easy to prove by setting R = N + CCT and noting that
$$\displaystyle \begin{aligned} (\mathbf{N} + \mathbf{CC}^T)^{ - 1} \mathbf{N}(\mathbf{N} + \mathbf{CC}^T)^{ - 1} &= {\mathbf{R}}^{ - 1}\mathbf{NR}^{ - 1} =\\ &= {\mathbf{R}}^{-1} - {\mathbf{R}}^{ - 1} + {\mathbf{R}}^{ - 1}\mathbf{NR}^{ - 1}=\\ &= {\mathbf{R}}^{ - 1} - {\mathbf{R}}^{ - 1}\mathbf{RR}^{ - 1} + {\mathbf{R}}^{ - 1}\mathbf{NR}^{ - 1} =\\ &= {\mathbf{R}}^{ - 1} - {\mathbf{R}}^{ - 1}(\mathbf{R} - \mathbf{N}){\mathbf{R}}^{ - 1}=\\ &= {\mathbf{R}}^{ - 1} - {\mathbf{R}}^{ - 1}\mathbf{CC}^T{\mathbf{R}}^{ - 1}=\\ & = (\mathbf{N} \,{+}\, \mathbf{CC}^T)^{-1} {-} (\mathbf{N} {+} \mathbf{CC}^T)^{ - 1}\mathbf{CC}^T(\mathbf{N} \,{+}\, \mathbf{CC}^T)^{ - 1}. \end{aligned} $$
The inverse of the coefficient matrix of the normal equations is given by
$$\displaystyle \begin{aligned} \left[ \begin{array}{cc} \mathbf{N} & \mathbf{C} \\ {{\mathbf{C}}^T} & \mathbf{0} \end{array} \right]^{ - 1} = \left[ \begin{array}{cc} {\mathbf{{Q}^{\prime}}_C } & {(\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{C}} \\ {{\mathbf{C}}^T(\mathbf{N} + \mathbf{CC}^T)^{ - 1}} & \mathbf{0} \end{array} \right], \end{aligned} $$
(81)
where
$$\displaystyle \begin{aligned} \mathbf{{Q}^{\prime}}_C &= (\mathbf{N} + \mathbf{CC}^T)^{ - 1} - (\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{CC}^T(\mathbf{N} + \mathbf{CC}^T)^{ - 1}=\\ &= (\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{N}(\mathbf{N} + \mathbf{CC}^T)^{ - 1}. \end{aligned} $$
(82)
In order to verify (81) we take into account that Q′C = R−1 −R−1CCTR−1, that CTR−1N = 0 and NR−1C = 0 follow from (65), and that CTR−1C = I from (67). Multiplying the original matrix with its inverse we obtain
$$\displaystyle \begin{aligned} &\left[ \begin{array}{cc} \mathbf{N} & \mathbf{C} \\ {{\mathbf{C}}^T} & \mathbf{0} \end{array} \right]\left[ \begin{array}{cc} {{\mathbf{R}}^{ - 1} - {\mathbf{R}}^{ - 1}\mathbf{CC}^T{\mathbf{R}}^{ - 1}} & {{\mathbf{R}}^{ - 1}\mathbf{C}} \\ {{\mathbf{C}}^T{\mathbf{R}}^{ - 1}} & \mathbf{0} \end{array} \right] = \\ &\quad = \left[ \begin{array}{cc} {\mathbf{NR}^{ - 1} - \mathbf{NR}^{ - 1}\mathbf{CC}^T{\mathbf{R}}^{ - 1} + \mathbf{CC}^T{\mathbf{R}}^{ - 1}} & {\mathbf{NR}^{ - 1}\mathbf{C}} \\ {{\mathbf{C}}^T{\mathbf{R}}^{ - 1} - {\mathbf{C}}^T{\mathbf{R}}^{ - 1}\mathbf{CC}^T{\mathbf{R}}^{ - 1}} & {{\mathbf{C}}^T{\mathbf{R}}^{ - 1}\mathbf{C}} \end{array} \right] = \\ &\quad = \left[ \begin{array}{cc} {\mathbf{NR}^{ - 1} - \mathbf{0C}^T{\mathbf{R}}^{ - 1} + \mathbf{CC}^T{\mathbf{R}}^{ - 1}} & \mathbf{0} \\ {{\mathbf{C}}^T{\mathbf{R}}^{ - 1} - \mathbf{IC}^T{\mathbf{R}}^{ - 1}} & \mathbf{I} \end{array} \right] = \left[ \begin{array}{cc} {(\mathbf{N}+\mathbf{CC}^T){\mathbf{R}}^{ - 1}} & \mathbf{0} \\ \mathbf{0} & \mathbf{I} \end{array} \right] = \left[ \begin{array}{cc} \mathbf{I} & \mathbf{0} \\ \mathbf{0} & \mathbf{I} \end{array} \right]. \end{aligned} $$
(83)
Utilizing the above inverse, the solution takes the form
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} \hat{\mathbf{x}} \\ \mathbf{k} \end{array} \right] &= \left[ \begin{array}{cc} \mathbf{N} & \mathbf{C} \\ {{\mathbf{C}}^T} & \mathbf{0} \end{array} \right]^{ - 1}\left[ \begin{array}{c} \mathbf{u} \\ \mathbf{d} \end{array} \right] = \left[ \begin{array}{cc} {{\mathbf{R}}^{ - 1} - {\mathbf{R}}^{ - 1}\mathbf{CC}^T{\mathbf{R}}^{ - 1}} & {{\mathbf{R}}^{ - 1}\mathbf{C}} \\ {{\mathbf{C}}^T{\mathbf{R}}^{ - 1}} & \mathbf{0} \end{array} \right]\left[ \begin{array}{c} \mathbf{u} \\ \mathbf{d} \end{array} \right]=\\ &= \left[ \begin{array}{cc} {{\mathbf{R}}^{ - 1}\mathbf{u} - {\mathbf{R}}^{ - 1}\mathbf{CC}^T{\mathbf{R}}^{ - 1}\mathbf{u} + {\mathbf{R}}^{ - 1}\mathbf{Cd}} \\ {{\mathbf{C}}^T{\mathbf{R}}^{ - 1}\mathbf{u}} \end{array} \right], \end{aligned} $$
(84)
explicitly \(\hat {\mathbf {x}} = {\mathbf {R}}^{ - 1}\mathbf {u} - {\mathbf {R}}^{ - 1}\mathbf {CC}^T{\mathbf {R}}^{ - 1}\mathbf {u} + {\mathbf {R}}^{ - 1}\mathbf {Cd}\) and k = CTR−1u. Recalling that from (64) CTR−1AT = 0 it follows that k = CTR−1u = CTR−1ATPb = 0 and \(\hat {\mathbf {x}} = {\mathbf {R}}^{ - 1}\mathbf {u} + {\mathbf {R}}^{ - 1}\mathbf {Cd}\). Therefore, the solution to the normal equations is given by
$$\displaystyle \begin{aligned} \hat{\mathbf{x}}_C = (\mathbf{N} + \mathbf{CC}^T)^{ - 1}(\mathbf{u} + \mathbf{Cd}),\quad \mathbf{k} = \mathbf{0}.\end{aligned} $$
(85)
Covariance propagation on \(\hat {\mathbf {x}} = {\mathbf {R}}^{ - 1}\mathbf {u} + {\mathbf {R}}^{ - 1}\mathbf {Cd} = {\mathbf {R}}^{ - 1}{\mathbf {A}}^T \mathbf {Pb} + {\mathbf {R}}^{ - 1}\mathbf {Cd}\) with Qb = P−1 gives
$$\displaystyle \begin{aligned} {\mathbf{Q}}_{\hat{\mathbf{x}}_{{}_C }} = ({\mathbf{R}}^{ - 1}{\mathbf{A}}^T\mathbf{P}){\mathbf{P}}^{ - 1}({\mathbf{R}}^{ - 1}{\mathbf{A}}^T\mathbf{P})^T = {\mathbf{R}}^{ - 1}{\mathbf{A}}^T\mathbf{PAR}^{ - 1} = {\mathbf{R}}^{-1} \mathbf{NR}^{ - 1} = \mathbf{{Q}^{\prime}}_C . \end{aligned} $$
(86)
Therefore the covariance matrix of \(\hat {\mathbf {x}}_C \) is \({\mathbf {C}}_{\hat {\mathbf {x}}_C } = \sigma ^2{\mathbf {Q}}_{\hat {\mathbf {x}}_C } \) with covariance factor matrix
$$\displaystyle \begin{aligned} {\mathbf{Q}}_{\hat{\mathbf{x}}_{{}_C }} &= \mathbf{{Q}^{\prime}}_C = (\mathbf{N} + \mathbf{CC}^T)^{ - 1} - (\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{CC}^T(\mathbf{N} + \mathbf{CC}^T)^{ - 1}\\ & = (\mathbf{N} + \mathbf{CC}^T)^{ - 1}\mathbf{N}(\mathbf{N} + \mathbf{CC}^T)^{ - 1}.\end{aligned} $$
(87)
If \(\hat {\mathbf {x}}\) is a least squares solution obtained with some set of minimal constraints, which may even be unknown to us, we can convert it into a solution \(\hat {\mathbf {x}}_C \) satisfying a specific set of minimal constraints CTx = d by choosing the right set of parameters p in the linearized coordinate transformation \(\hat {\mathbf {x}}_C = \hat {\mathbf {x}} + \mathbf {Ep}\). Indeed, from \({\mathbf {C}}^T\hat {\mathbf {x}}_C = {\mathbf {C}}^T\hat {\mathbf {x}} + {\mathbf {C}}^T\mathbf {Ep} = \mathbf {d}\) it follows that \(\mathbf {p} = ({\mathbf {C}}^T\mathbf {E})^{ - 1}(\mathbf {d} - {\mathbf {C}}^T\hat {\mathbf {x}})\) and hence
$$\displaystyle \begin{aligned} \hat{\mathbf{x}}_C = [\mathbf{I} - \mathbf{E}({\mathbf{C}}^T\mathbf{E})^{ - 1}{\mathbf{C}}^T]\hat{\mathbf{x}} + \mathbf{E}({\mathbf{C}}^T\mathbf{E})^{ - 1}\mathbf{d}.\end{aligned} $$
(88)
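The a-posteriori conversion (88) can likewise be demonstrated numerically: starting from a least squares solution obtained with one set of minimal constraints, (88) produces the solution satisfying any other set, without re-adjusting the observations. A minimal sketch on the same synthetic test problem (an illustrative assumption, not part of the text):

import numpy as np

rng = np.random.default_rng(1)
m, n, d = 8, 12, 3
E = rng.normal(size=(m, d))
A = rng.normal(size=(n, m)) @ (np.eye(m) - E @ np.linalg.solve(E.T @ E, E.T))  # AE = 0
N, C = A.T @ A, rng.normal(size=(m, d))
u, dvec = A.T @ rng.normal(size=n), rng.normal(size=d)

x_hat = np.linalg.solve(N + C @ C.T, u + C @ dvec)        # LS solution for constraints (C, dvec)
C2, d2 = rng.normal(size=(m, d)), rng.normal(size=d)      # a different set of minimal constraints
x_C2 = x_hat - E @ np.linalg.solve(C2.T @ E, C2.T @ x_hat - d2)   # conversion (88)
print(np.allclose(C2.T @ x_C2, d2))                       # the new constraints are satisfied
print(np.allclose(A @ x_C2, A @ x_hat))                   # the adjusted observables are unchanged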
Of particular interest is the solution \(\hat {\mathbf {x}}_E \) of minimum norm, which is the solution among all least squares solutions for which the norm ϕ = ||x||2 = xTx is minimized. It can be recovered from any least squares solution \(\hat {\mathbf {x}}\), obtained by minimal constraints with an appropriate choice of the parameters p in the linearized transformation \(\mathbf {x} = \hat {\mathbf {x}} + \mathbf {Ep}\). To find the proper p we minimize ϕ = xTx = \((\hat {\mathbf {x}} + \mathbf {Ep})^T(\hat {\mathbf {x}} + \mathbf {Ep})\) by setting \(\frac {\partial \phi }{\partial \mathbf {p}} = 2(\hat {\mathbf {x}} + \mathbf {Ep})^T\mathbf {E} = \mathbf {0}\), which gives \(\mathbf {p} =- ({\mathbf {E}}^T\mathbf {E})^{ - 1}{\mathbf {E}}^T\hat {\mathbf {x}}\) and the minimum norm solution is given by
$$\displaystyle \begin{aligned} \hat{\mathbf{x}}_E = \left[ {\mathbf{I} - \mathbf{E}({\mathbf{E}}^T\mathbf{E})^{ - 1}{\mathbf{E}}^T} \right]\hat{\mathbf{x}}.\end{aligned} $$
(89)
The above relation can be easily generalized to \(\hat {\mathbf {x}}_W = \left [ {\mathbf {I} - \mathbf {E}({\mathbf {E}}^T\mathbf {WE})^{ - 1}{\mathbf {E}}^T\mathbf {W}} \right ]\hat {\mathbf {x}}\), which minimizes the weighted norm \(\vert \vert \mathbf {x}\vert \vert _W^2 = {\mathbf {x}}^T\mathbf {Wx}\), for any positive-definite matrix W.

Note that the idempotent matrix \({\mathbf {P}}_{N(\mathbf {A})^ \bot } = \mathbf {I} - \mathbf {E}({\mathbf {E}}^T\mathbf {E})^{ - 1}{\mathbf {E}}^T\) is a projection operator onto the orthogonal complement N(A)⊥ of the null space N(A) = {x ∈ Rm|Ax = 0} = R(E) of the design matrix, the latter being spanned by the columns of E.

Comparing this with the conversion (88) from one least squares solution to another, it follows that the minimum norm solution can also be obtained by using minimal constraints CTx = d with the simple choice C = E and d = 0. The constraints ETx = 0, where E is the coefficient matrix in the linearized transformation \(\tilde {\mathbf {x}} =\mathbf {x} + \mathbf {Ep}\) under a change of the reference system, are called inner constraints.

The “inner constraints” or “minimum norm” or “free network” solution derives directly from the minimal constraints solution (78), (79) by setting C = E and d = 0, which gives
$$\displaystyle \begin{aligned} &\hat{\mathbf{x}}_E = ( {\mathbf{N} + \mathbf{EE}^T} )^{ - 1}\mathbf{u}, \end{aligned} $$
(90)
$$\displaystyle \begin{aligned} &{\mathbf{Q}}_{\hat{\mathbf{x}}_E } = (\mathbf{N} + \mathbf{EE}^T)^{ - 1} - \mathbf{E}({\mathbf{E}}^T\mathbf{E})^{ - 2}{\mathbf{E}}^T. \end{aligned} $$
(91)
If a set of minimal constraints CTx = d is multiplied from the left with any non-singular matrix S, we obtain a completely equivalent set of constraints SCTx = Sd, or \(\tilde {\mathbf {C}}^T\mathbf {x} = \tilde {\mathbf {d}}\) with \(\tilde {\mathbf {C}} = \mathbf {CS}^T\) and \(\tilde {\mathbf {d}} = \mathbf {Sd}\), providing exactly the same least squares solution. The same is true for the inner constraints ETx = 0, which can be replaced with \(\tilde {\mathbf {E}}^T\mathbf {x} = \mathbf {0}\), where \(\tilde {\mathbf {E}} = \mathbf {ER}^T\) with an arbitrary non-singular matrix R. Writing the solutions for minimal and inner constraints with \(\tilde {\mathbf {C}}\), \(\tilde {\mathbf {d}}\), \(\tilde {\mathbf {E}}\) in place of C, d, E, then replacing \(\tilde {\mathbf {C}} = \mathbf {CS}^T\), \(\tilde {\mathbf {d}} = \mathbf {Sd}\), \(\tilde {\mathbf {E}} = \mathbf {ER}^T\), respectively, and setting G = STS in the first case and G = RTR in the second, we obtain a slightly generalized form of the minimal and inner constraints solutions
$$\displaystyle \begin{aligned} &\hat{\mathbf{x}}_C = (\mathbf{N} + \mathbf{CGC}^T)^{ - 1}(\mathbf{u} + \mathbf{CGd}) = (\mathbf{N} + \mathbf{CGC}^T)^{ - 1}\mathbf{u} + \mathbf{E}({\mathbf{C}}^T\mathbf{E})^{ - 1}\mathbf{d}, \end{aligned} $$
(92)
$$\displaystyle \begin{aligned} &\hat{\mathbf{x}}_E = (\mathbf{N} + \mathbf{EGE}^T)^{ - 1}\mathbf{u}, \end{aligned} $$
(94)
where G is now an arbitrary non-singular symmetric matrix. The matrix G should be chosen in such a way that the corresponding diagonal elements of the matrices N and CGCT in the sum N + CGCT (or N and EGET in the sum N + EGET) have the same order of magnitude, thus restricting the effect of round-off numerical errors in the computations.
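That the matrix G (equivalently, the scaling S of the constraints) has no influence on the resulting estimates, but only on the numerical conditioning, can be seen in the following sketch on the same synthetic test problem (illustrative assumptions as before):

import numpy as np

rng = np.random.default_rng(1)
m, n, d = 8, 12, 3
E = rng.normal(size=(m, d))
A = rng.normal(size=(n, m)) @ (np.eye(m) - E @ np.linalg.solve(E.T @ E, E.T))  # AE = 0
N, C = A.T @ A, rng.normal(size=(m, d))
u, dvec = A.T @ rng.normal(size=n), rng.normal(size=d)

S = rng.normal(size=(d, d)) + 3.0 * np.eye(d)             # a non-singular scaling of the constraints
G = S.T @ S                                               # symmetric positive-definite
x_I = np.linalg.solve(N + C @ C.T, u + C @ dvec)          # (85), i.e., G = I
x_G = np.linalg.solve(N + C @ G @ C.T, u + C @ (G @ dvec))  # the generalized form (92)
print(np.allclose(x_I, x_G))                              # identical least squares solution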

Some authors arrive at the above results by introducing so-called stochastic constraints d = CTx + ed (or 0 = ETx + ed), with ed ∼ (0, σ2G−1) uncorrelated with e. In this case (94) and (95) are simply the solutions to the normal equations based on the two sets b = Ax + e and the stochastic constraints. We strongly dislike this approach because it leads to misinterpretation. The constraints do not represent any actual information based on observational evidence; they are merely a means for selecting a particular least squares solution out of the infinitely many ones, and thus for selecting, at the same time, a particular reference system out of the infinitely many ones. The choice of a reference system, which is in any case a mathematical convention and not a real physical object, is a purely deterministic process and there is nothing stochastic about it. This becomes obvious from the fact that the resulting solution \(\hat {\mathbf {x}}_C \) satisfies the “stochastic” constraints exactly, \({\mathbf {C}}^T\hat {\mathbf {x}}_C = \mathbf {d}\), in which case the corresponding error estimates become \(\hat {\mathbf {e}}_d = {\mathbf {C}}^T\hat {\mathbf {x}}_C - \mathbf {d} = \mathbf {0}\) and the pseudo-data d are not adjusted at all!

The solutions with minimal and inner constraints can be associated with generalized inverses of the normal equations matrix N or of the design matrix A. Recall that the generalized inverses M− of a matrix M are characterized by four properties
$$\displaystyle \begin{aligned}\begin{array}{r*{20}l} {} &\mathbf{M} = \mathbf{MM}^- \mathbf{M}, & \quad (\mathrm{G}1) \end{array}\end{aligned} $$
(96)
$$\displaystyle \begin{aligned}\begin{array}{r*{20}l} {} &{\mathbf{M}}^- = {\mathbf{M}}^- \mathbf{MM}^- , & \quad (\mathrm{G}2) \end{array}\end{aligned} $$
(97)
$$\displaystyle \begin{aligned}\begin{array}{r*{20}l} {} &(\mathbf{PMM}^- )^T = \mathbf{PMM}^- , & \quad (\mathrm{G}3) \end{array}\end{aligned} $$
(98)
$$\displaystyle \begin{aligned}\begin{array}{r*{20}l} {} &({\mathbf{M}}^- \mathbf{MR})^T = {\mathbf{M}}^- \mathbf{MR}. & \quad (\mathrm{G}4) \end{array}\end{aligned} $$
(99)
For a matrix M− to be characterized as a generalized inverse of M, it must necessarily satisfy the generalized inverse property (G1). (G2) is the reflexivity property (M being also a generalized inverse of M−), and if both (G1) and (G2) are satisfied, M− is a reflexive generalized inverse of M. (G3) is the least squares property, and if (G1) and (G3) are satisfied, M− is a least squares generalized inverse of M. (G4) is the minimum norm property, and if (G1) and (G4) are satisfied, M− is a minimum norm generalized inverse of M. If M is n × m, the least squares property refers to the norm \(\vert \vert \mathbf {y}\vert \vert _{\mathbf {P}}^2 = {\mathbf {y}}^T\mathbf {Py}\) in Rn and the minimum norm property to the norm \(\vert \vert \mathbf {x}\vert \vert _{\mathbf {R}}^2 = {\mathbf {x}}^T\mathbf {Rx}\) in Rm. For a system y = Mx, x = M−y is a least squares solution if M− satisfies (G1) and (G3), i.e., it satisfies \(\vert \vert \mathbf {y} - \mathbf {Mx}\vert \vert _{\mathbf {P}}^2 = (\mathbf {y} - \mathbf {Mx})^T\mathbf {P}(\mathbf {y} - \mathbf {Mx}) = \min \). Similarly, if y ∈ R(M), so that y = Mx is a consistent system, then x = M−y is a minimum norm solution if M− satisfies (G1) and (G4), i.e., it satisfies \(\vert \vert \mathbf {x}\vert \vert _{\mathbf {R}}^2 = {\mathbf {x}}^T\mathbf {Rx} = \min \) among all x for which Mx = y. Finally, if a generalized inverse satisfies all four properties (G1), (G2), (G3), (G4), it is the unique reflexive, least squares, minimum norm generalized inverse; it is called the pseudoinverse of M and is denoted by M+.
From properties (73) and (74) it follows that \({\mathbf {Q}}_{\hat {\mathbf {x}}_C } = {\mathbf {Q}}_C \) satisfies \(\mathbf {NQ}_{\hat {\mathbf {x}}_C } \mathbf {N} = \mathbf {N}\) and \({\mathbf {Q}}_{\hat {\mathbf {x}}_C } \mathbf {NQ}_{\hat {\mathbf {x}}_C } = {\mathbf {Q}}_{\hat {\mathbf {x}}_C } \). It is therefore a reflexive generalized inverse of N. The matrix
$$\displaystyle \begin{aligned} {\mathbf{A}}^- = (\mathbf{N} + \mathbf{CC}^T)^{ - 1}{\mathbf{A}}^T\mathbf{P} = {\mathbf{R}}^{ - 1}{\mathbf{A}}^T\mathbf{P}, \end{aligned} $$
(100)
appearing in the solution \(\hat {\mathbf {x}}_C = (\mathbf {N} + \mathbf {CC}^T)^{ - 1}{\mathbf {A}}^T\mathbf {Pb} + (\mathbf {N} + \mathbf {CC}^T)^{ - 1}\mathbf {Cd}\), satisfies properties (G1), (G2), (G3), but not (G4), and is thus a least squares reflexive generalized inverse of the design matrix A. Indeed from (64) CTR−1AT = 0, and thus
$$\displaystyle \begin{aligned} \mathbf{AA}^- \mathbf{A} = \mathbf{AR}^{ - 1}\mathbf{N} = \mathbf{AR}^{ - 1}(\mathbf{R} - \mathbf{CC}^T) = \mathbf{AR}^{ - 1}\mathbf{R} - \mathbf{AR}^{ - 1}\mathbf{CC}^T = \mathbf{AR}^{ - 1} \mathbf{R} = \mathbf{A},\end{aligned} $$
satisfying (G1) while
$$\displaystyle \begin{aligned} {\mathbf{A}}^- \mathbf{AA}^- &= {\mathbf{R}}^{ - 1}\mathbf{NR}^{ - 1}{\mathbf{A}}^T\mathbf{P} = {\mathbf{R}}^{ - 1}(\mathbf{R} - \mathbf{CC}^T){\mathbf{R}}^{ - 1}{\mathbf{A}}^T\mathbf{P}=\\ & = {\mathbf{R}}^{ - 1}{\mathbf{A}}^T\mathbf{P} - {\mathbf{R}}^{ - 1}\mathbf{CC}^T{\mathbf{R}}^{ - 1}{\mathbf{A}}^T\mathbf{P} = {\mathbf{R}}^{ - 1}{\mathbf{A}}^T\mathbf{P} = {\mathbf{A}}^- ,\end{aligned} $$
and (G2) is also satisfied. It is also obvious that PAA− = PA(N + CCT)−1ATP is a symmetric matrix and (G3) is satisfied. (G4) is not satisfied for the norm ||x||2 = xTx, because A−A = (N + CCT)−1N is not symmetric. Since the inner constraints are just a special case of minimal constraints, the covariance factor matrix \({\mathbf {Q}}_{\hat {\mathbf {x}}_E } \) shares with \({\mathbf {Q}}_{\hat {\mathbf {x}}_C } \) the properties \(\mathbf {NQ}_{\hat {\mathbf {x}}_E } \mathbf {N} = \mathbf {N}\) (G1) and \({\mathbf {Q}}_{\hat {\mathbf {x}}_E } \,\mathbf {NQ}_{\hat {\mathbf {x}}_E } = {\mathbf {Q}}_{\hat {\mathbf {x}}_E } \) (G2). From \({\mathbf {Q}}_{\hat {\mathbf {x}}_E } = (\mathbf {N} + \mathbf {EE}^T)^{ - 1} - \mathbf {E}({\mathbf {E}}^T\mathbf {E})^{ - 2}{\mathbf {E}}^T\) and NE = 0 it follows that \({\mathbf {Q}}_{\hat {\mathbf {x}}_E } \mathbf {N} = (\mathbf {N} + \mathbf {EE}^T)^{ - 1}\mathbf {N}\) and \(\mathbf {NQ}_{\hat {\mathbf {x}}_E } = \mathbf {N}(\mathbf {N} + \mathbf {EE}^T)^{ - 1}\). From property (72) with C = E and \({\mathbf {Q}}_C = {\mathbf {Q}}_{\hat {\mathbf {x}}_E }\) we obtain \({\mathbf {Q}}_{\hat {\mathbf {x}}_E } \mathbf {N} = \mathbf {I} - \mathbf {E}({\mathbf {E}}^T\mathbf {E})^{ - 1}{\mathbf {E}}^T\) and \(\mathbf {NQ}_{\hat {\mathbf {x}}_E } = \mathbf {I} - \mathbf {E}({\mathbf {E}}^T\mathbf {E})^{ - 1}{\mathbf {E}}^T\). Therefore both \(\mathbf {NQ}_{\hat {\mathbf {x}}_E } \) and \({\mathbf {Q}}_{\hat {\mathbf {x}}_E } \mathbf {N}\) are symmetric and (G3) and (G4) are also satisfied, for P = I and R = I. Since all four properties are satisfied, \({\mathbf {Q}}_{\hat {\mathbf {x}}_E } \) is the pseudoinverse of N
$$\displaystyle \begin{aligned} {\mathbf{Q}}_{\hat{\mathbf{x}}_E } = {\mathbf{N}}^ +, \end{aligned} $$
(101)
with respect to the norm ||x||2 = xTx in Rm.
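Relation (101) can be confirmed directly against a general-purpose pseudoinverse routine. A minimal sketch on the same synthetic test problem (an illustrative assumption, not part of the text):

import numpy as np

rng = np.random.default_rng(1)
m, n, d = 8, 12, 3
E = rng.normal(size=(m, d))
A = rng.normal(size=(n, m)) @ (np.eye(m) - E @ np.linalg.solve(E.T @ E, E.T))  # AE = 0
N = A.T @ A                                               # singular, with null space R(E)

iEE = np.linalg.inv(E.T @ E)
Q_E = np.linalg.inv(N + E @ E.T) - E @ iEE @ iEE @ E.T    # (N + EE^T)^{-1} - E (E^T E)^{-2} E^T
print(np.allclose(Q_E, np.linalg.pinv(N)))                # True: Q_E is the pseudoinverse N^+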
For the matrix A− = (N + EET)−1ATP, appearing in the inner constraints solution \(\hat {\mathbf {x}}_E = {\mathbf {A}}^- \mathbf {b}\) (the term E(ETE)−1d of the general minimal constraints solution vanishes here, since d = 0), we note that A−A = (N + EET)−1N = I −E(ETE)−1ET is symmetric, the last equality following from property (63) with C = E and its transposition. Thus property (G4) is satisfied with R = I. In addition, PAA− = PA(N + EET)−1ATP is also symmetric and property (G3) is satisfied. (G1) is satisfied, since AE = 0 gives AA−A = A[I −E(ETE)−1ET] = A. It also holds that
$$\displaystyle \begin{aligned} {\mathbf{A}}^- \mathbf{AA}^- = (\mathbf{N} + \mathbf{EE}^T)^{ - 1}{\mathbf N} (\mathbf{N} + \mathbf{EE}^T)^{ - 1}{\mathbf{A}}^T\mathbf{P} \end{aligned}$$
while from (91)
$$\displaystyle \begin{aligned} (\mathbf{N} + \mathbf{EE}^T)^{ - 1}\mathbf{N}(\mathbf{N} + \mathbf{EE}^T)^{ - 1} = (\mathbf{N} + \mathbf{EE}^T)^{ - 1} - \mathbf{E}({\mathbf{E}}^T\mathbf{E})^{ - 2}{\mathbf{E}}^T \end{aligned}$$
and since ETAT = 0 we get
$$\displaystyle \begin{aligned} {\mathbf{A}}^- \mathbf{AA}^- = (\mathbf{N} + \mathbf{EE}^T)^{ - 1}{\mathbf{A}}^T\mathbf{P} - \mathbf{E}({\mathbf{E}}^T\mathbf{E})^{ - 2}{\mathbf{E}}^T{\mathbf{A}}^T\mathbf{P} = (\mathbf{N} + \mathbf{EE}^T)^{ - 1}{\mathbf{A}}^T\mathbf{P} = {\mathbf{A}}^- \end{aligned}$$
and property (G2) is also satisfied. Since A− = (N + EET)−1ATP satisfies all four properties, it is the pseudoinverse of A
$$\displaystyle \begin{aligned} (\mathbf{N} + \mathbf{EE}^T)^{ - 1}{\mathbf{A}}^T\mathbf{P} = {\mathbf{A}}^{+},\quad {\hat{\mathbf x}_E}={\mathbf{A}}^ + {\mathbf b}, \end{aligned} $$
(102)
with respect to the norm ||y||2 = yTPy in Rn and the norm ||x||2 = xTx in Rm.
Recalling that in statistics estimable parameters within the linear model are defined as those possessing linear unbiased estimates, we may formally take the expectation of the minimal constraints solution; taking into account that d = CTx and E{b} = Ax, we get
$$\displaystyle \begin{aligned} E\{\hat{\mathbf{x}}_C \} = (\mathbf{N} + \mathbf{CC}^T)^{ - 1}({\mathbf{A}}^T\mathbf{P}E\{\mathbf{b}\} + \mathbf{Cd}) = (\mathbf{N} + \mathbf{CC}^T)^{ - 1}(\mathbf{Nx} + \mathbf{CC}^T\mathbf{x}) = \mathbf{x}, \end{aligned} $$
(103)
and so we have managed to get an unbiased estimate of a non-estimable quantity! To resolve this paradox we notice that estimability was established using the minimal constraints relation CTx = d, so it seems that coordinates are non-estimable quantities in the model b = Ax + v, but estimable in the joint model b = Ax + v, CTx = d. From the physical point of view, coordinates are estimable when the constraints CTx = d hold true in a physical sense and are not merely a computational device for obtaining one out of the infinitely many least squares solutions \(\hat {\mathbf {x}}\) of the model b = Ax + v, v ∼ (0, σ2P−1). Even in this case, one may counter-argue that the arbitrarily chosen minimal constraints CTx = d define a reference system and, once this system is chosen, the coordinates with respect to it become estimable. The problem though is that this reference system remains physically inaccessible, because we cannot physically access its origin and the directions of its axes on the basis of the available estimates \(\hat {\mathbf {x}}_C \). The reason is that the latter define an estimated network shape which differs from the unknown true one. Fitting parts of this estimated shape to corresponding parts of the real network will lead to a different realization of the reference system, depending on which parts we choose to fit. This is a general problem with reference frames, a term which in geodesy refers to the realization of the reference system by means of a set of estimated coordinates (plus velocities for deformable networks). To give an example of minimal constraints that have a physical meaning and lead to estimable coordinates, consider a classical planar network where angles and distances have been observed. The three minimal constraints XA = YA = 0, YB = 0 define a reference system with origin at network point A and first axis in the direction of the line AB. For any other point P, the coordinate YP is the distance of P from the line AB and the coordinate XP is the distance AP′, where P′ is the projection of P on the line AB. But these quantities refer solely to the geometric form of the network and are independent of any choice of reference system. Therefore they are determinable, and hence estimable, quantities, and the estimates \(\hat {X}_P \), \(\hat {Y}_P \) are indeed unbiased, also because the introduced constraints have defined a reference system with physically accessible origin and axes.
In a more general setup, we may reorder the coordinates \(\mathbf {x} = \left [ \begin {array}{c} {{\mathbf {x}}_1 } \\ {{\mathbf {x}}_2 } \end {array} \right ]\) in the model \(\mathbf {b} = \left [ {{\mathbf {A}}_1 \, {\mathbf {A}}_2 } \right ]\left [ \begin {array}{c} {{\mathbf {x}}_1 } \\ {{\mathbf {x}}_2 } \end {array} \right ] + \mathbf {v}\), so that the d constraints x2 = 0 are minimal constraints of the form \({\mathbf {C}}^T\mathbf {x} = \left [ {\mathbf {0}\ {\mathbf {I}}_d } \right ]\left [ \begin {array}{c} {{\mathbf {x}}_1 } \\ {{\mathbf {x}}_2 } \end {array} \right ] = \mathbf {0}\). We shall call constraints of this type, or of their generalization x2 = c2, where c2 are fixed known values, trivial constraints. The estimates can also be obtained by replacing \({\mathbf {C}}^T = \left [ {\mathbf {0}\ {\mathbf {I}}_d } \right ]\), d = c2 in the general solution, but it is easier to eliminate the fixed coordinates x2 = c2 and work with the reduced model b −A2c2 = A1x1 + v to obtain the estimates \(\hat {\mathbf {x}}_1 = ({\mathbf {A}}_1^T \mathbf {PA}_1 )^{ - 1}{\mathbf {A}}_1^T \mathbf {P}(\mathbf {b} - {\mathbf {A}}_2 {\mathbf {c}}_2 )\) and \(\hat {\mathbf {x}}_2 = {\mathbf {c}}_2 \), with covariance factor matrices \({\mathbf {Q}}_{\hat {\mathbf {x}}_1 } = ({\mathbf {A}}_1^T \mathbf {PA}_1 )^{ - 1}\), \({\mathbf {Q}}_{\hat {\mathbf {x}}_2 } = \mathbf {0}\), \({\mathbf {Q}}_{\hat {\mathbf {x}}_1 \hat {\mathbf {x}}_2 } = \mathbf {0}\). This is a special case of a general parameter elimination technique based on the minimal constraints \({\mathbf {C}}^T\mathbf {x} = \left [ {{\mathbf {C}}_1^T \quad {\mathbf {C}}_2^T } \right ]\left [ \begin {array}{c} {{\mathbf {x}}_1 } \\ {{\mathbf {x}}_2 } \end {array} \right ] = {\mathbf {C}}_1^T {\mathbf {x}}_1 + {\mathbf {C}}_2^T {\mathbf {x}}_2 = \mathbf {d}\), where by parameter reordering the d × d matrix \({\mathbf {C}}_2^T \) is non-singular. Solving for \({\mathbf {x}}_2 = {\mathbf {C}}_2^{ - T} \mathbf {d} - {\mathbf {C}}_2^{ - T} {\mathbf {C}}_1^{ T} {\mathbf {x}}_1 \) and substituting in b = A1x1 + A2x2 + v, we obtain the reduced model \(\bar {\mathbf {b}} \equiv \mathbf {b} - {\mathbf {A}}_2 {\mathbf {C}}_2^{ - T} \mathbf {d} = ({\mathbf {A}}_1 - {\mathbf {A}}_2 {\mathbf {C}}_2^{ - T} {\mathbf {C}}_1^T ){\mathbf {x}}_1 + \mathbf {v} \equiv \bar {\mathbf {A}}_1 {\mathbf {x}}_1 + \mathbf {v}\) without rank defect, having least squares solution
$$\displaystyle \begin{aligned} &{} \hat{\mathbf{x}}_1 = (\bar{\mathbf{A}}_1^T \mathbf{P}\bar{\mathbf{A}}_1 )^{ - 1}\bar{\mathbf{A}}_1^T \mathbf{P}\bar{\mathbf{b}} \equiv \bar{\mathbf{N}}_1^{ - 1} \bar{\mathbf{u}},\quad {\mathbf{Q}}_{\hat{\mathbf{x}}_1 } = \bar{\mathbf{N}}_1^{ - 1} , \end{aligned} $$
(104)
$$\displaystyle \begin{aligned} & \hat{\mathbf{x}}_2 = {\mathbf{C}}_2^{ - T} \mathbf{d} - {\mathbf{C}}_2^{ - T} {\mathbf{C}}_1^T \hat{\mathbf{x}}_1 ,\qquad {\mathbf{Q}}_{\hat{\mathbf{x}}_2 } = {\mathbf{C}}_2^{ - T} {\mathbf{C}}_1^T {\mathbf{Q}}_{\hat{\mathbf{x}}_1 } {\mathbf{C}}_1 {\mathbf{C}}_2^{ - 1} ,\\ &{\mathbf{Q}}_{\hat{\mathbf{x}}_1 \hat{\mathbf{x}}_2 } = - {\mathbf{C}}_2^{ - T} {\mathbf{C}}_1^T {\mathbf{Q}}_{\hat{\mathbf{x}}_1 } .\end{aligned} $$
(105)
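The elimination solution (104), (105) agrees with the direct use of the trivial constraints in the bordered normal equations, as the following sketch illustrates on the same synthetic test problem (with x2 taken, for illustration, as the last d parameters):

import numpy as np

rng = np.random.default_rng(1)
m, n, d = 8, 12, 3
E = rng.normal(size=(m, d))
A = rng.normal(size=(n, m)) @ (np.eye(m) - E @ np.linalg.solve(E.T @ E, E.T))  # AE = 0
b, c2 = rng.normal(size=n), rng.normal(size=d)            # observations and fixed values x2 = c2

A1, A2 = A[:, :m-d], A[:, m-d:]
x1 = np.linalg.solve(A1.T @ A1, A1.T @ (b - A2 @ c2))     # reduced model, P = I

C = np.vstack([np.zeros((m - d, d)), np.eye(d)])          # trivial constraints C^T = [0 I_d]
K = np.block([[A.T @ A, C], [C.T, np.zeros((d, d))]])
sol = np.linalg.solve(K, np.concatenate([A.T @ b, c2]))
print(np.allclose(sol[:m-d], x1), np.allclose(sol[m-d:m], c2))   # True True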
Inner constraints minimize the norm of the whole vector of parameters x, which, in addition to the network coordinates, may include other parameters related to the observational process. By proper ordering, let x1 denote the network coordinates or, more generally, a subset of the parameters of immediate interest. Then we may minimize the seminorm \(\vert \vert {\mathbf {x}}_1 \vert \vert ^2 = {\mathbf {x}}_1^T {\mathbf {x}}_1 \) instead of the norm ||x||2 = xTx. This is achieved by replacing the inner constraints \({\mathbf {E}}^T\mathbf {x} = \left [ {{\mathbf {E}}_1^T \quad {\mathbf {E}}_2^T } \right ]\left [ \begin {array}{c} {{\mathbf {x}}_1 } \\ {{\mathbf {x}}_2 } \end {array} \right ] = {\mathbf {E}}_1^T {\mathbf {x}}_1 + {\mathbf {E}}_2^T {\mathbf {x}}_2 = \mathbf {0}\) with the so-called partial inner constraints
$$\displaystyle \begin{aligned} {\mathbf{C}}^T\mathbf{x} = \left[ {{\mathbf{E}}_1^T \quad \mathbf{0}} \right]\left[ \begin{array}{c} {{\mathbf{x}}_1 } \\ {{\mathbf{x}}_2 } \end{array} \right] = {\mathbf{E}}_1^T {\mathbf{x}}_1 = \mathbf{0}.\end{aligned} $$
(106)
To see this, consider the linearized coordinate transformation \({\hat {\mathbf {x}}^{\prime }} = \hat {\mathbf {x}} + \mathbf {Ep}\), which splits into \({\hat {\mathbf {x}}^{\prime }}_1 = \hat {\mathbf {x}}_1 + {\mathbf {E}}_1 \mathbf {p}\) and \({\hat {\mathbf {x}}^{\prime }}_2 = \hat {\mathbf {x}}_2 + {\mathbf {E}}_2 \mathbf {p}\). Minimizing \(\phi = ({\hat {\mathbf {x}}^{\prime }}_1 )^T{\hat {\mathbf {x}}^{\prime }}_1 = (\hat {\mathbf {x}}_1 + {\mathbf {E}}_1 \mathbf {p})^T(\hat {\mathbf {x}}_1 + {\mathbf {E}}_1 \mathbf {p})\) by setting \(\frac {\partial \phi }{\partial \mathbf {p}} = 2(\hat {\mathbf {x}}_1 + {\mathbf {E}}_1 \mathbf {p})^T{\mathbf {E}}_1 = \mathbf {0}\), we obtain \(\mathbf {p} = - ({\mathbf {E}}_1^T {\mathbf {E}}_1 )^{ - 1}{\mathbf {E}}_1^T \hat {\mathbf {x}}_1 \), and the conversion to the partial inner constraints solution satisfying \({\mathbf {E}}_1^T {\mathbf {x}}_1 = \mathbf {0}\) takes the form
$$\displaystyle \begin{aligned} {\hat{\mathbf{x}}^{\prime}} &= \hat{\mathbf{x}} - \mathbf{E}({\mathbf{E}}_1^T {\mathbf{E}}_1 )^{ - 1}{\mathbf{E}}_1^T \hat{\mathbf{x}}_1 = \hat{\mathbf{x}} - \mathbf{E}\left( {\left[ \begin{array}{cc} {{\mathbf{E}}_1^T } & \mathbf{0} \end{array} \right]\left[ \begin{array}{c} {{\mathbf{E}}_1 } \\ {{\mathbf{E}}_2 } \end{array} \right]} \right)^{ - 1}\left[ \begin{array}{cc} {{\mathbf{E}}_1^T } & \mathbf{0} \end{array} \right]\left[ \begin{array}{c} {\hat{\mathbf{x}}_1} \\ {\hat{\mathbf{x}}_2} \end{array} \right]=\\ & = [\mathbf{I} - \mathbf{E}({\mathbf{C}}^T\mathbf{E})^{ - 1}{\mathbf{C}}^T]\hat{\mathbf{x}},\end{aligned} $$
(107)
with \({\mathbf {C}}^T = \left [ {{\mathbf {E}}_1^T \quad \mathbf {0}} \right ]\). The partial inner constraints solution \(\hat {\mathbf {x}}_{EP} \) follows easily from the general minimal constraints solution by replacing \({\mathbf {C}}^T = \left [ {{\mathbf {E}}_1^T \quad \mathbf {0}} \right ]\) and d = 0:
$$\displaystyle \begin{aligned} \hat{\mathbf{x}}_{EP} &= \left[ \begin{array}{cc} {{\mathbf{N}}_{11} + {\mathbf{E}}_1 {\mathbf{E}}_1^T } & {{\mathbf{N}}_{12} } \\ {{\mathbf{N}}_{12}^T } & {{\mathbf{N}}_{22} } \end{array} \right]^{ - 1}\left[ \begin{array}{c} {{\mathbf{u}}_1 + {\mathbf{E}}_1 \mathbf{d}} \\ {{\mathbf{u}}_2 } \end{array} \right] =\\ &=\left[ \begin{array}{cc} {{\mathbf{N}}_{11} + {\mathbf{E}}_1 {\mathbf{E}}_1^T } & {{\mathbf{N}}_{12} } \\ {{\mathbf{N}}_{12}^T } & {{\mathbf{N}}_{22} } \end{array} \right]^{ - 1}\left[ \begin{array}{c} {{\mathbf{u}}_1 } \\ {{\mathbf{u}}_2 } \end{array} \right],\end{aligned} $$
(108)
$$\displaystyle \begin{aligned} {\mathbf{Q}}_{\hat{\mathbf{x}}_{EP} } &= \left[ \begin{array}{cc} {{\mathbf{N}}_{11} + {\mathbf{E}}_1 {\mathbf{E}}_1^T } & {{\mathbf{N}}_{12} } \\ {{\mathbf{N}}_{12}^T } & {{\mathbf{N}}_{22} } \end{array} \right]^{ - 1}\left[ \begin{array}{cc} {{\mathbf{N}}_{11} } & {{\mathbf{N}}_{12} } \\ {{\mathbf{N}}_{12}^T } & {{\mathbf{N}}_{22} } \end{array} \right]\left[ \begin{array}{cc} {{\mathbf{N}}_{11} + {\mathbf{E}}_1 {\mathbf{E}}_1^T } & {{\mathbf{N}}_{12} } \\ {{\mathbf{N}}_{12}^T } & {{\mathbf{N}}_{22} } \end{array} \right]^{ - 1} = \\ &= \left[ \begin{array}{cc} {{\mathbf{N}}_{11} + {\mathbf{E}}_1 {\mathbf{E}}_1^T } & {{\mathbf{N}}_{12} } \\ {{\mathbf{N}}_{12}^T } & {{\mathbf{N}}_{22} } \end{array} \right]^{ - 1} - \left[ \begin{array}{cc} {{\mathbf{E}}_1 ({\mathbf{E}}_1^T {\mathbf{E}}_1 )^{ - 2}{\mathbf{E}}_1^T } & {{\mathbf{E}}_1 ({\mathbf{E}}_1^T {\mathbf{E}}_1 )^{ - 2}{\mathbf{E}}_2^T } \\ {{\mathbf{E}}_2 ({\mathbf{E}}_1^T {\mathbf{E}}_1 )^{ - 2}{\mathbf{E}}_1^T } & {{\mathbf{E}}_2 ({\mathbf{E}}_1^T {\mathbf{E}}_1 )^{ - 2}{\mathbf{E}}_2^T} \end{array} \right].\end{aligned} $$
(109)
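A short numerical illustration of the partial inner constraints (again on the synthetic test problem, with the first m1 parameters playing the role of x1, an illustrative assumption) confirms that (108) yields a least squares solution whose x1-part satisfies E1Tx1 = 0:

import numpy as np

rng = np.random.default_rng(1)
m, n, d = 8, 12, 3
E = rng.normal(size=(m, d))
A = rng.normal(size=(n, m)) @ (np.eye(m) - E @ np.linalg.solve(E.T @ E, E.T))  # AE = 0
N, u = A.T @ A, A.T @ rng.normal(size=n)

m1 = 6                                                    # parameters of immediate interest
E1 = E[:m1]
C = np.vstack([E1, np.zeros((m - m1, d))])                # C^T = [E1^T 0], see (106)
x_EP = np.linalg.solve(N + C @ C.T, u)                    # (108) with d = 0
print(np.allclose(E1.T @ x_EP[:m1], 0.0))                 # partial inner constraints hold
x_E = np.linalg.solve(N + E @ E.T, u)                     # full inner constraints (90)
print(np.allclose(A @ x_EP, A @ x_E))                     # both are least squares solutions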
A generalization of the inner constraints are the minimum weighted norm constraints or generalized inner constraints [42], where the minimized quantity is the squared distance \(\vert \vert \mathbf {x} - {\mathbf {x}}_{ref} \vert \vert _{\mathbf {W}}^2 = (\mathbf {x} - {\mathbf {x}}_{ref} )^T\mathbf {W}(\mathbf {x} - {\mathbf {x}}_{ref} )\) from a fixed known value xref. These can be used to adapt the solution to a preexisting solution xref, assigning different weights to different points according to their importance, or to different coordinates, e.g., by downweighting the less accurate vertical components with respect to the horizontal ones. The solution can be derived from the linearized transformation \(\hat {\mathbf {x}}_G = \hat {\mathbf {x}} + \mathbf {Ep}\), where \(\hat {\mathbf {x}}\) is any least squares solution, by minimizing \(\phi = (\mathbf {x} - {\mathbf {x}}_{ref} )^T\mathbf {W}(\mathbf {x} - {\mathbf {x}}_{ref} ) = (\hat {\mathbf {x}} + \mathbf {Ep} - {\mathbf {x}}_{ref} )^T\mathbf {W}(\hat {\mathbf {x}} + \mathbf {Ep} - {\mathbf {x}}_{ref} )\). Setting \(\frac {\partial \phi }{\partial \mathbf {p}} = \mathbf {0}\) gives the parameter values
$$\displaystyle \begin{aligned} \mathbf{p} = - ({\mathbf{E}}^T\mathbf{WE})^{ - 1}{\mathbf{E}}^T\mathbf{W}(\hat{\mathbf{x}} - {\mathbf{x}}_{ref} ) \end{aligned}$$
for which the desired solution becomes
$$\displaystyle \begin{aligned} \hat{\mathbf{x}}_G = [\mathbf{I} - \mathbf{E}({\mathbf{E}}^T\mathbf{WE})^{ - 1}{\mathbf{E}}^T\mathbf{W}]\hat{\mathbf{x}} + \mathbf{E}({\mathbf{E}}^T\mathbf{WE})^{ - 1}{\mathbf{E}}^T\mathbf{Wx}_{ref} . \end{aligned} $$
(110)
Comparing with the general transformation (88) from one least squares solution to another, we conclude that the same solution can be obtained by minimal constraints CTx = d, if we choose CT = ETW and d = ETWxref, i.e., with the generalized inner constraints
$$\displaystyle \begin{aligned} {\mathbf{E}}^T\mathbf{W}(\mathbf{x} - {\mathbf{x}}_{ref} ) = \mathbf{0}. \end{aligned} $$
(111)
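As an illustration, the conversion (110) to the generalized inner constraints solution can be sketched in a few lines of numpy; the weight matrix W and the target solution x_ref are user-chosen inputs, and the function name is illustrative only:

```python
import numpy as np

def to_generalized_inner(x_hat, E, W, x_ref):
    """Sketch of Eqs. (110)-(111): convert any least squares solution x_hat
    into the one satisfying the generalized inner constraints
    E^T W (x - x_ref) = 0."""
    p = -np.linalg.solve(E.T @ W @ E, E.T @ W @ (x_hat - x_ref))
    return x_hat + E @ p      # Eq. (110), written as x_hat + E p
```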
In analogy with the usual inner constraints we may consider also partial generalized inner constraints where \(\vert \vert {\mathbf {x}}_1 - {\mathbf {x}}_1^{ref} \vert \vert _{{\mathbf {W}}_1 }^2 = ({\mathbf {x}}_1 - {\mathbf {x}}_1^{ref} )^T{\mathbf {W}}_1 ({\mathbf {x}}_1 - {\mathbf {x}}_1^{ref} )\) is minimized. From the split transformation \({\hat {\mathbf {x}}^{\prime }}_1 = \hat {\mathbf {x}}_1 + {\mathbf {E}}_1 \mathbf {p}\) and \({\hat {\mathbf {x}}^{\prime }}_2 = \hat {\mathbf {x}}_2 + {\mathbf {E}}_2 \mathbf {p}\) we may minimize
$$\displaystyle \begin{aligned} \phi = ({\hat{\mathbf{x}}^{\prime}}_1 - {\mathbf{x}}_1^{ref} )^T{\mathbf{W}}_1 ({\hat{\mathbf{x}}^{\prime}}_1 - {\mathbf{x}}_1^{ref} ) = (\hat{\mathbf{x}}_1 + {\mathbf{E}}_1 \mathbf{p} - {\mathbf{x}}_1^{ref} )^T{\mathbf{W}}_1 (\hat{\mathbf{x}}_1 + {\mathbf{E}}_1 \mathbf{p} - {\mathbf{x}}_1^{ref} ), \end{aligned} $$
(112)
setting \(\frac {\partial \phi }{\partial \mathbf {p}} = 2(\hat {\mathbf {x}}_1 + {\mathbf {E}}_1 \mathbf {p} - {\mathbf {x}}_1^{ref} )^T{\mathbf {W}}_1 {\mathbf {E}}_1 = \mathbf {0}\) to obtain \(\mathbf {p} = - ({\mathbf {E}}_1^T {\mathbf {W}}_1 {\mathbf {E}}_1 )^{ - 1}{\mathbf {E}}_1^T {\mathbf {W}}_1\)\((\hat {\mathbf {x}}_1 - {\mathbf {x}}_1^{ref} )\), so that the conversion to the generalized partial inner constraints solution takes the form \({\hat {\mathbf {x}}^{\prime }}_1 = \left [ {\mathbf {I} - {\mathbf {E}}_1 ({\mathbf {E}}_1^T {\mathbf {W}}_1 {\mathbf {E}}_1 )^{ - 1}{\mathbf {E}}_1^T {\mathbf {W}}_1 } \right ]\hat {\mathbf {x}}_1 + {\mathbf {E}}_1 ({\mathbf {E}}_1^T {\mathbf {W}}_1 {\mathbf {E}}_1 )^{ - 1}{\mathbf {E}}_1^T {\mathbf {W}}_1 {\mathbf {x}}_1^{ref} \), \({\hat {\mathbf {x}}^{\prime }}_2 = \hat {\mathbf {x}}_2 - {\mathbf {E}}_2 ({\mathbf {E}}_1^T {\mathbf {W}}_1 {\mathbf {E}}_1 )^{ - 1}{\mathbf {E}}_1^T {\mathbf {W}}_1 (\hat {\mathbf {x}}_1 - {\mathbf {x}}_1^{ref} )\), which can be combined into
$$\displaystyle \begin{aligned} &{\hat{\mathbf{x}}^{\prime}} = \hat{\mathbf{x}} - \mathbf{E}({\mathbf{E}}_1^T {\mathbf{W}}_1 {\mathbf{E}}_1 )^{ - 1}{\mathbf{E}}_1^T {\mathbf{W}}_1 \hat{\mathbf{x}}_1 + \mathbf{E}({\mathbf{E}}_1^T {\mathbf{W}}_1 {\mathbf{E}}_1 )^{ - 1}{\mathbf{E}}_1^T {\mathbf{W}}_1 {\mathbf{x}}_1^{ref} = \\ & = \hat{\mathbf{x}} - \mathbf{E}\left( {\left[ \begin{array}{c@{\quad }c} {{\mathbf{E}}_1^T {\mathbf{W}}_1 } & \mathbf{0} \end{array} \right]\mathbf{E}} \right)^{ - 1}\left[ \begin{array}{c@{\quad }c} {{\mathbf{E}}_1^T {\mathbf{W}}_1 } & \mathbf{0} \end{array} \right]\hat{\mathbf{x}} + \mathbf{E}\left( {\left[ \begin{array}{c@{\quad }c} {{\mathbf{E}}_1^T {\mathbf{W}}_1 } & \mathbf{0} \end{array} \right]\mathbf{E}} \right)^{ - 1}{\mathbf{E}}_1^T {\mathbf{W}}_1 {\mathbf{x}}_1^{ref} . \end{aligned} $$
(113)
Comparison with the general conversion (88) from one least squares solution to another shows that the desired solution can be obtained also from the minimal constraints solution, setting \({\mathbf {C}}^T = \left [ \begin {array}{cc} {{\mathbf {E}}_1^T {\mathbf {W}}_1 } & \mathbf {0} \end {array} \right ]\) and \(\mathbf {d} = {\mathbf {E}}_1^T {\mathbf {W}}_1 {\mathbf {x}}_1^{ref} \), i.e., by the generalized partial inner constraints
$$\displaystyle \begin{aligned} {\mathbf{E}}_1^T {\mathbf{W}}_1 ({\mathbf{x}}_1 - {\mathbf{x}}_1^{ref} ) = \mathbf{0}. \end{aligned} $$
(114)
The idea of inner constraints and free networks originates in the work of Meissl [45, 46, 47]. It has been further elaborated by Blaha [15, 16], Grafarend and Schaffrin [38, 39], Baarda [11, 12] and many others. Precursors of these ideas were already present in the work of Bjerhammar [14], who in 1951 rediscovered the pseudoinverse of Moore [48], now widely known as the Moore-Penrose inverse, before Penrose in 1955 [50]. For more details we suggest [13, 28, 41, 56].

6 Mathematical Modeling of Spatiotemporal Reference Systems for a Deformable Geodetic Network: Deterministic Aspects and Reference System Optimality

In the case of a rigid network with time-invariant geometric configuration the reference system needs to be chosen just once and remains the same for any time instant t. In contrast, in the case of a deformable network, a reference system needs to be chosen for every instant t, in correspondence with a time-varying geometric configuration. Thus the choice of a spatial reference system for a rigid network is replaced by the choice of a spatiotemporal reference system in the case of a deforming network.

Before attacking the actual problem of the choice of the reference system within the process of data analysis, it is necessary to examine the reference system choice within a deterministic model, where we assume that the geometric configuration of the network is known for all epochs t within some time interval. In the simplest case, the temporal variation of the geometric configuration is continuous and the choice of a reference system for each epoch t should lead to its representation by a continuous function x(t), where x contains the coordinates of all the n network points. The more realistic case of non-continuous deformation, with discontinuities associated with seismic events, can be treated similarly in a piecewise manner for the intervals between seismic discontinuities. For each epoch, a coordinate transformation \(\tilde {\mathbf {x}}_i (t) = \mathbf {t}({\mathbf {x}}_i (t),\mathbf {p}(t))\) leads to an equally valid continuous representation \(\tilde {\mathbf {x}}(t)\), provided that the transformation parameter functions p(t) are continuous (Fig. 2). For each fixed epoch all such representations \(\tilde {\mathbf {x}}(t)\) constitute a shape manifold St in R3n, corresponding to all coordinates that give the same geometric configuration of the network at the particular epoch t. The set of all shape manifolds St, for all t in the considered interval, constitutes a fibering of the subset ⋃tSt ⊂ R3n, provided that \(S_t \cap S_{t^{\prime }} = \O \) for t ≠ t′, i.e., that we exclude the pathological cases where the network returns to a previous geometric configuration or remains rigid for some time interval. Before the introduction of the (now spatiotemporal) reference system, the deterministic model for the temporal evolution of the geometric configuration of the network consists of the time-dependent collection of shape manifolds St for \(t \in \left [ {t_F ,t_L } \right ]\), where tF and tL are the initial and final epochs, respectively, of the time interval to which our analysis is confined. The choice of a spatiotemporal reference system now becomes the choice of a curve C in ⋃tSt ⊂ R3n, which intersects each shape manifold St at a unique point x(t), i.e., C ∩ St = {x(t)}. In this context the question of what is the optimal choice of a reference system arises in a natural way. Mathematically, if xref(t) is some arbitrarily chosen reference system, the problem reduces to that of finding the optimal parameter functions p(t), which transform the reference system choice xref(t) into the optimal one x(t) = t(xref(t), p(t)). Here t : (xref, p) →x stands for the transformation from the coordinates of the reference to those of the optimal reference system, defined pointwise by the rigid transformation
$$\displaystyle \begin{aligned} {\mathbf{x}}_i (t) = \mathbf{R}(\boldsymbol{\uptheta }(t)){\mathbf{x}}_{i,ref} (t) + \mathbf{d}(t),\quad i = 1,2,\ldots ,n, \end{aligned} $$
(115)
where the transformation parameters \(\mathbf {p}(t) = \left [ {\boldsymbol {\uptheta }(t)^T\,\mathbf {d}(t)^T} \right ]^T\) consist of the rotation parameters θ(t) of the orthogonal matrix R(θ(t)) and the displacement vector d(t). The most obvious optimality characteristic is that the optimal choice x(t) should not exhibit coordinate variations that come from the choice of the reference system itself and do not reflect actual variations in the geometric configuration of the network. In plain words, the station coordinates should vary as little as possible! The remaining problem is how to quantify coordinate variation. In the framework of differential geometry this optimality characteristic is satisfied by requiring that the curve x(t) is a geodesic joining two points \(\mathbf {x}(t_F ) \in S_{t_F } \) and \(\mathbf {x}(t_L ) \in S_{t_L } \). The geodesic property requires a choice of metric in R3n, where ⋃tSt is embedded, which we choose to be the simple Euclidean metric
$$\displaystyle \begin{aligned} \rho (\mathbf{x},\mathbf{{x}^{\prime}}) = \vert \vert \mathbf{{x}^{\prime}} - \mathbf{x}\vert \vert_E = \sqrt{(\mathbf{{x}^{\prime}} - \mathbf{x})^T(\mathbf{{x}^{\prime}} - \mathbf{x})} = \sqrt{\sum_i {(\mathbf{{x}^{\prime}}_i - {\mathbf{x}}_i )^T(\mathbf{{x}^{\prime}}_i - {\mathbf{x}}_i )} } . \end{aligned} $$
(116)
This does not completely solve the problem because there exists a geodesic for any such arbitrary pair of endpoints. We must further require that x(t) is the shortest possible geodesic between all points \(\mathbf {x}(t_F ) \in S_{t_F } \) and \(\mathbf {x}(t_L ) \in S_{t_L } \).
Fig. 2

A reference system x(t) is a section C of the fibering F of the network coordinate space R3n, with fibers the shape manifolds of the network at various epochs t. An optimal reference system \(\tilde {\mathbf {x}}(t)\) is a section \(\tilde {C}\) (a geodesic curve in R3n) intersecting perpendicularly all shape fibers

If we fix x(tF) and let x(tL) vary within StL, among all geodesics between x(tF) and the various x(tL), there will be one having the shortest possible length \(L_{\mathbf {x}(t_F )} \), which depends only on x(tF). The question is whether there is a particular shortest length among all possible \(L_{\mathbf {x}(t_F )} \), as x(tF) varies over \(S_{t_F } \). It turns out, as we will see below, that all \(L_{\mathbf {x}(t_F )} \) are equal and there is therefore an infinite number of shortest geodesics between \(S_{t_F } \) and \(S_{t_L } \), all with the same length, which all provide optimal spatiotemporal reference systems.

For a geodesic between x(tF) and \(S_{t_L } \) to be shortest, it must hold that it is perpendicular to \(S_{t_L } \), which means that the tangent vector \(\dot {\mathbf {x}}(t_L )\) must be orthogonal to the linear manifold \(T_{\mathbf {x}(t_L )} (S_{t_L } )\), tangent to the nonlinear manifold \(S_{t_L } \). By the reciprocal argument, \(\dot {\mathbf {x}}(t_F )\) must be orthogonal to the linear manifold \(T_{\mathbf {x}(t_F )} (S_{t_F } )\), tangent to the nonlinear manifold \(S_{t_F } \). Since the choice of the initial epoch tF and the final epoch tL is rather arbitrary, they may be replaced by any two epochs t1 and t2 in the interval [tF, tL]. Orthogonality must therefore hold at x(t1) and x(t2), and hence at any point x(t) on the curve. To explicitly express the orthogonality relation \(\dot {\mathbf {x}}(t) \bot T_{\mathbf {x}(t)} (S_t )\), we need a basis for the tangent linear manifold to St at a point x(t), which is provided by the local tangent vectors
$$\displaystyle \begin{aligned} {\mathbf{e}}_i (\mathbf{x}(t)) = \frac{\partial {\textbf{t}}}{\partial p_i }(\mathbf{x}(t)), \,i = 1, \cdots ,6, \end{aligned} $$
(117)
so that
$$\displaystyle \begin{aligned} T_{\mathbf{x}(t)} (S_t ) = \mathrm{span}\left\{ {\frac{\partial \mathbf{t}}{\partial p_i }\left( {\mathbf{x}(t)} \right)} \right\}. \end{aligned} $$
(118)
Expressing the orthogonality \(\dot {\mathbf {x}}(t) \bot T_{\mathbf {x}(t)} (S_t )\) by the vanishing of the Euclidean inner product, it turns out that the optimal reference system x(t) = t(xref(t), p(t)) is provided by the solution p(t) to the set of differential equations
$$\displaystyle \begin{aligned} \dot{\mathbf{x}}(t)^T\left[ {\frac{\partial \mathbf{t}}{\partial p_i }(\mathbf{x}(t))} \right] = 0,\qquad i = 1,\ldots 6,\quad t \in [t_F , \,t_L ]. \end{aligned} $$
(119)
Regarding optimality, we must note that if pf is a fixed (time independent) set of transformation parameters, then the coordinate systems x(t) and \(\tilde {\mathbf {x}}(t) = \mathbf {t}\left ( {\mathbf {x}(t),{\mathbf {p}}_f } \right )\) are kinematically equivalent, since they are connected at any epoch by a fixed set of transformation parameters and thus exhibit the same coordinate variation. We will characterize such reference systems as parallel.

Definition 1

Two spatiotemporal reference systems will be called parallel, if the values of the parameters of the transformation from one to the other are constant for all epochs within the time interval under consideration.

For two parallel spatiotemporal reference systems connected through x′(t) = Rx(t) + d, where R and d are constant, the tangent vectors are related by \(\frac {d\mathbf {{x}^{\prime }}}{dt}(t) = \mathbf {R}\frac {d\mathbf {x}}{dt}(t)\). Using the relation between the tangent vector magnitude and the length element ds it follows that
$$\displaystyle \begin{aligned} \left( {\frac{d{s}^{\prime}}{dt}} \right)^2 = \left( {\frac{d\mathbf{{x}^{\prime}}}{dt}} \right)^T\left( {\frac{d\mathbf{{x}^{\prime}}}{dt}} \right) = \left( {\frac{d\mathbf{x}}{dt}} \right)^T{\mathbf{R}}^T\mathbf{R}\left( {\frac{d\mathbf{x}}{dt}} \right) = \left( {\frac{d\mathbf{x}}{dt}} \right)^T\left( {\frac{d\mathbf{x}}{dt}} \right) = \left( {\frac{ds}{dt}} \right)^2 \end{aligned}$$
and therefore the length elements of the two curves are equal, ds′ = ds. Consequently, the lengths L′ and L of the two shortest geodesics between \(S_{t_F } \) and \(S_{t_L } \), beginning from x′(tF) and x(tF), respectively, are equal, since \({L}^{\prime } = \int \nolimits _{t_F }^{t_L } {d{s}^{\prime }} =\)\(\int \nolimits _{t_F }^{t_L } {ds} = L\).

The differential equations (119) defining the optimal solution have infinitely many solutions, each particular one depending on integration constants, which may be the chosen initial values x(tF). Different choices of x(tF) lead to different optimal solutions, which are parallel in the above defined sense. Therefore, the reference system at some particular reference epoch must be arbitrarily chosen, while the optimality criteria determine only the temporal evolution of the reference system, i.e., the reference system at any subsequent epoch tF < t ≤ tL. This is true when rigid transformations are allowed between reference systems, keeping the scale fixed, but it does not hold for similarity transformations. The reason lies in the choice of the Euclidean metric as a means of quantification of coordinate variation: the choice of an arbitrarily small scale factor λ(t) = 1 + s(t) can make the length of the resulting shortest geodesic arbitrarily small, so that the shortest geodesic becomes indeterminable. For the similarity transformation case \(\tilde {\mathbf {x}}(t) = \left ( {1 + s(t)} \right )\mathbf {R}(t)\mathbf {x}(t) + \mathbf {d}(t)\), a different metric is needed, which is invariant not only with respect to rotation and translation, but also with respect to change of scale.

We restrict ourselves here to the standard case where network scale is already defined, with a realization based on a set of atomic clocks, so that only rigid transformations of displacement and rotation are allowed between the different possible reference systems. We assume that some reference solution x(t) is already established and we seek a solution \(\tilde {\mathbf {x}}(t)\), which is optimal, i.e., a shortest geodesic. The rigid transformation for each network point i has the form \(\tilde {\mathbf {x}}_i = \mathbf {R}(\boldsymbol {\uptheta }){\mathbf {x}}_i + \mathbf {d}\), where dependence on time is omitted for the sake of simplicity, and we seek to find the optimal functions θ(t), d(t) satisfying the orthogonality conditions (119) and thus providing an optimal solution. The tangent vectors to the shape manifold are the columns of the matrices \(\frac {\partial \tilde {\mathbf {x}}}{\partial \mathbf {d}}\), \(\frac {\partial \tilde {\mathbf {x}}}{\partial \boldsymbol {\uptheta }}\) and the orthogonality condition takes the form
$$\displaystyle \begin{aligned} \dot{\tilde{\mathbf{x}}}^T\frac{\partial \tilde{\mathbf{x}}}{\partial \mathbf{d}} = \sum_{i = 1}^n {\dot{\tilde{\mathbf{x}}}_i^T } \frac{\partial \tilde{\mathbf{x}}_i }{\partial \mathbf{d}} = \mathbf{0},\qquad \dot{\tilde{\mathbf{x}}}^T\frac{\partial \tilde{\mathbf{x}}}{\partial \boldsymbol{\uptheta }} = \sum_{i = 1}^n {\dot{\tilde{\mathbf{x}}}_i^T } \frac{\partial \tilde{\mathbf{x}}_i }{\partial \boldsymbol{\uptheta }} = \mathbf{0}. \end{aligned} $$
(120)
Introducing the notation
$$\displaystyle \begin{aligned} \left[ {\boldsymbol{\upomega}_k \times } \right] = \mathbf{R}\frac{\partial {\mathbf{R}}^T}{\partial \theta_k },\qquad k = 1,2,3\qquad \boldsymbol{\Omega} = [\boldsymbol{\upomega}_1 \boldsymbol{\upomega}_2 \boldsymbol{\upomega}_3 ], \end{aligned} $$
(121)
we can easily show that \(\dot {\mathbf {R}}{\mathbf x}_i = \left [ {(\mathbf {Rx}_i )\times } \right ]\boldsymbol {\Omega } \boldsymbol {\dot {\uptheta }}\) so that \(\dot {\tilde {\mathbf {x}}}_i = \dot {\mathbf {R}}{\mathbf {x}}_i + \mathbf {R}\dot {\mathbf {x}}_i + \dot {\mathbf {d}}\) becomes \(\dot {\tilde {\mathbf {x}}}_i = \left [ {(\mathbf {Rx}_i )\times } \right ]\boldsymbol {\Omega } \boldsymbol {\dot {\uptheta }} + \mathbf {R}\dot {\mathbf {x}}_i + \dot {\mathbf {d}}\), while \(\frac {\partial \tilde {\mathbf {x}}_i }{\partial \mathbf {d}} = {\mathbf {I}}_3 \) and \(\frac {\partial (\tilde {\mathbf {x}}_i )}{\partial \boldsymbol {\uptheta }} = \frac {\partial (\mathbf {Rx}_i )}{\partial \boldsymbol {\uptheta }} =\) [(Rxi)×] Ω. For the proof note that (121) implies that \(\frac {\partial \mathbf {R}}{\partial \theta _k } = - [\boldsymbol {\upomega }_k \times ]\mathbf {R}\) so that \(\dot {\mathbf {R}}{\mathbf {x}}_i = - \sum \nolimits _k {\dot {\uptheta }_k [\boldsymbol {\upomega }_k \times ]\mathbf {Rx}_i } = [(\mathbf {Rx}_i )\times ]\sum \nolimits _k {\dot {\theta }_k \boldsymbol {\upomega }_k } = [(\mathbf {Rx}_i )\times ]\boldsymbol {\Omega } \boldsymbol {{\dot \uptheta }}\). With these values and setting
$$\displaystyle \begin{aligned} &\mathbf{m} = \frac{1}{n}\sum_{i = 1}^n {{\mathbf{x}}_i } ,\quad \dot{\mathbf{m}} = \frac{1}{n}\sum_{i = 1}^n {\dot{\mathbf{x}}_i } , \\ &\mathbf{C} = - \sum_{i = 1}^n {[({\mathbf{x}}_i - \mathbf{m})\times ]^2} = - \sum_{i = 1}^n {[{\mathbf{x}}_i \times ]^2} + n[\mathbf{m}\times ]^2, \\ {} &\mathbf{h} = \sum_{i = 1}^n {[({\mathbf{x}}_i - \mathbf{m})\times ](\dot{\mathbf{x}}_i - \dot{\mathbf{m}})} = \sum_{i = 1}^n {[{\mathbf{x}}_i \times ]\dot{\mathbf{x}}_i } - n[\mathbf{m}\times ]\dot{\mathbf{m}}, \end{aligned} $$
(122)
the orthogonality conditions take the form
$$\displaystyle \begin{aligned} &\frac{1}{n}\left( {\dot{\tilde{\mathbf{x}}}^T\frac{\partial \tilde{\mathbf{x}}}{\partial \mathbf{d}}} \right)^T = \frac{1}{n}\sum_{i = 1}^n \left( \frac{\partial \tilde{\mathbf{x}}_i }{\partial \mathbf{d}} \right)^T\dot{\tilde{\mathbf{x}}}_i = \mathbf{R}\left[ {\mathbf{m}\times } \right]{\mathbf{R}}^T\boldsymbol{\Omega} \dot{\boldsymbol{\uptheta }} + \mathbf{R}\dot{\mathbf{m}} + \dot{\mathbf{d}} = \mathbf{0}, \end{aligned} $$
(123)
$$\displaystyle \begin{aligned} &{\mathbf{R}}^T{\boldsymbol{\Omega} }^{ - T}\left( {\dot{\tilde{\mathbf{x}}}^T\frac{\partial \tilde{\mathbf{x}}}{\partial \boldsymbol{\uptheta }}} \right)^T = {\mathbf{R}}^T{\boldsymbol{\Omega} }^{ - T}\sum_{i = 1}^n \left( \frac{\partial \tilde{\mathbf{x}}_i }{\partial \boldsymbol{\uptheta }} \right)^T\dot{\tilde{\mathbf{x}}}_i=\\ &\qquad = \mathbf{CR}^T\boldsymbol{\Omega} \dot{\boldsymbol{\uptheta }} - n[\mathbf{m}\times ]^2{\mathbf{R}}^T\boldsymbol{\Omega} \dot{\boldsymbol{\uptheta}} - \mathbf{h} - n[\mathbf{m}\times ]\dot{\mathbf{m}} - n[\mathbf{m}\times ]{\mathbf{R}}^T\dot{\mathbf{d}} = \mathbf{0}. \end{aligned} $$
(124)
Solving the first for \(\dot {\mathbf {m}} = - [\mathbf {m}\times ]{\mathbf {R}}^T\boldsymbol {\Omega } \dot {\boldsymbol {\uptheta }} - {\mathbf {R}}^T\dot {\mathbf {d}}\) and replacing in the second we finally arrive at the desired system of nonlinear differential equations \(\dot {\mathbf {d}} =- \mathbf {R}[\mathbf {m}\times ]{\mathbf {R}}^T\boldsymbol {\Omega } \dot {\boldsymbol {\uptheta }} - \mathbf {R}\dot {\mathbf {m}}\), \(\dot {\boldsymbol {\uptheta }} = \boldsymbol {\Omega }^{ - 1}\mathbf {RC}^{ - 1}\mathbf {h}\), or after replacing the second into the first,
$$\displaystyle \begin{aligned} &\dot{\mathbf{d}} = - \mathbf{R}(\boldsymbol{\uptheta })\dot{\mathbf{m}} - \mathbf{R}(\boldsymbol{\uptheta })[\mathbf{m}\times ]{\mathbf{C}}^{ - 1}\mathbf{h} \end{aligned} $$
(125)
$$\displaystyle \begin{aligned} &\dot{\boldsymbol{\uptheta }} = {\boldsymbol{\Omega} }(\boldsymbol{{\uptheta }})^{ - 1}\mathbf{R}(\boldsymbol{{\uptheta }}){\mathbf{C}}^{ - 1}\mathbf{h}.\end{aligned} $$
(126)
These differential equations have infinitely many solutions, depending on the initial values (integration constants) d0 = d(tF), θ0 = θ(tF). Each choice leads to one of the infinitely many possible shortest geodesics, with different starting point \(\tilde {\mathbf {x}}_i (t_F ) =\mathbf {R}(\boldsymbol {\uptheta }_0 ){\mathbf {x}}_i (t_F ) + {\mathbf {d}}_0 \) on the initial shape manifold \(S_{t_F } \). Different choices of θ0, d0, and hence of \(\tilde {\mathbf {x}}(t_F )\), correspond to reference systems that are parallel to each other, in the (already defined above) sense of being connected with time-invariant transformation parameters.
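As a purely numerical illustration, one may integrate Eqs. (125) and (126) along a given sequence of network configurations. The sketch below (hypothetical helper names, forward-Euler stepping, and the small-angle approximation R(θ) ≈ I − [θ×], Ω ≈ I) is meant only to show the structure of the computation; in practice a higher-order integrator and the full rotation matrix would be used:

```python
import numpy as np

def skew(a):
    """Cross product matrix [a x]."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def tisserand_quantities(X, Xdot):
    """m, m_dot, C, h of Eq. (122) for (n,3) positions X and velocities Xdot."""
    m, mdot = X.mean(axis=0), Xdot.mean(axis=0)
    C = -sum(skew(x - m) @ skew(x - m) for x in X)
    h = sum(skew(x - m) @ (xd - mdot) for x, xd in zip(X, Xdot))
    return m, mdot, C, h

def integrate_optimal_frame(Xs, dt):
    """Forward-Euler sketch of Eqs. (125)-(126); Xs is a sequence of (n,3)
    network configurations of the reference solution x(t)."""
    theta, d = np.zeros(3), np.zeros(3)
    for X0, X1 in zip(Xs[:-1], Xs[1:]):
        m, mdot, C, h = tisserand_quantities(X0, (X1 - X0) / dt)
        R = np.eye(3) - skew(theta)          # small-angle rotation matrix
        Cinv_h = np.linalg.solve(C, h)
        theta = theta + dt * (R @ Cinv_h)                   # Eq. (126), Omega ~ I
        d = d - dt * (R @ mdot + R @ skew(m) @ Cinv_h)      # Eq. (125)
    return theta, d
```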
In the special case that the reference solution x(t) has been chosen in a way that h = 0 and \(\dot {\mathbf {m}} = \mathbf {0}\), the differential equations degenerate into \(\dot {\mathbf {d}} = \mathbf {0}\), \(\boldsymbol {\dot {\uptheta }} = \mathbf {0}\), with constant solution d = d0, θ = θ0. This means that the general solution is parallel to any special solution with h = 0 and \(\dot {\mathbf {m}} = \mathbf {0}\). In other words, the conditions h = 0 and \(\dot {\mathbf {m}} = \mathbf {0}\) are sufficient for providing one of the desired shortest geodesics. To formally prove this statement we will start with an arbitrary solution x(t) and seek the functions d(t), θ(t) leading through x′i = R(θ)xi + d to a solution x′(t), which satisfies h′ = 0 and \({\dot {\mathbf {m}}^{\prime }} = \mathbf {0}\), and see whether this is an optimal solution or not. Obviously \({\dot {\mathbf {m}}^{\prime }} = \dot {\mathbf {R}}\mathbf {m} + \mathbf {R}\dot {\mathbf {m}} + \dot {\mathbf {d}}\), which with \(\dot {\mathbf {R}}\mathbf {m} = [(\mathbf {Rm})\times ]\,{\boldsymbol {\Omega } }\,\dot {\boldsymbol {\uptheta }}\) leads to
$$\displaystyle \begin{aligned} {\dot{\mathbf{m}}^{\prime}} = [(\mathbf{Rm})\,\times ]\,{\boldsymbol{\Omega} }\,\dot{\boldsymbol{\uptheta}} + \mathbf{R}\dot{\mathbf{m}} + \dot{\mathbf{d}}. \end{aligned} $$
(127)
Replacing \({\dot {\mathbf {x}}^{\prime }}_i - {\dot {\mathbf {m}}^{\prime }} = \mathbf {R}[({\mathbf {x}}_i - \mathbf {m})\times ]{\mathbf {R}}^T\,{\boldsymbol {\Omega } }\,\dot {\boldsymbol {\uptheta }} + \mathbf {R}(\dot {\mathbf {x}}_i - \dot {\mathbf {m}})\) and \({\mathbf {x}}^{\prime }_i - {\mathbf {m}}^{\prime } = \mathbf {R}({\mathbf {x}}_i - \mathbf {m})\), h′ becomes
$$\displaystyle \begin{aligned} \mathbf{{h}^{\prime}} &= \sum_{i = 1}^n {[(\mathbf{{x}^{\prime}}_i - \mathbf{{m}^{\prime}})\times ](\dot{\mathbf{x}}^{\prime}_i - \dot{\mathbf{m}}^{\prime})}=\\ &= \mathbf{R}\left( {\sum_{i = 1}^n {[({\mathbf{x}}_i - \mathbf{m})\times ]^2} } \right){\mathbf{R}}^T\boldsymbol{\Omega} \dot{\boldsymbol{\uptheta }} + \mathbf{R}\left( {\sum_{i = 1}^n {[({\mathbf{x}}_i - \mathbf{m})\times ]} (\dot{\mathbf{x}}_i - \dot{\mathbf{m}})} \right)= \\ &= - \mathbf{RCR}^T\boldsymbol{\Omega} \dot{\boldsymbol{\uptheta}} + \mathbf{Rh}. \end{aligned} $$
(128)
Setting \({\dot {\mathbf {m}}^{\prime }} = \mathbf {0}\) and h′ = 0 we arrive at \(\dot {\mathbf {d}} = - \mathbf {R}\dot {\mathbf {m}} - [(\mathbf {Rm})\times ]\,{\boldsymbol {\Omega } }\,\dot {\boldsymbol {\uptheta }}\) and \(\,\dot {\boldsymbol {\uptheta }} = {\boldsymbol {\Omega } }^{ - 1}\mathbf {RC}^{ - 1}\mathbf {h}\), respectively. Replacing the second in the first we arrive at the differential equations \(\dot {\mathbf {d}} = - \mathbf {R}\dot {\mathbf {m}} - \mathbf {R}[\mathbf {m}\,\times ]{\mathbf {C}}^{ - 1}\,\mathbf {h}\), \(\,\boldsymbol {\dot {\uptheta }} = {\boldsymbol {\Omega } }^{ - 1}\mathbf {RC}^{ - 1}\mathbf {h}\), which are identical to the ones (125) and (126) providing an optimal solution. In conclusion, the conditions \({\dot {\mathbf {m}}^{\prime }} = \mathbf {0}\) and h′ = 0 lead to an optimal solution (shortest geodesic), equivalent to the one defined directly by the orthogonality condition (119).
To understand the importance of the above conclusion, we must turn to the solution of the reference system choice problem that has been proposed in geophysics, not for a discrete geodetic network, but for the continuous mass distribution of the earth. The origin of the reference system is taken to be the geocenter, which in an arbitrary reference system has coordinates defined at every epoch by
$$\displaystyle \begin{aligned} {\mathbf{x}}_G (t) = \frac{1}{M(t)}\int_{E(t)}\mathbf{x}(t)dm(t) \end{aligned} $$
(129)
where dm(t) is the mass element, integration is taken over all the earth masses and \(M(t) = \int \nolimits _{E(t)} {dm(t)} \) is the total mass of the earth. The condition xG(t) = 0 for a geocentric reference system settles the issue of defining the origin of the system. For the orientation of the axes, the French astronomer Félix Tisserand (1845–1896) introduced the concept of what we now call Tisserand axes [49, 61]. In order to minimize the apparent motion of the earth masses with respect to the optimal reference system, he chose to minimize the relative kinetic energy of the earth \(T_R (t) = \textstyle {1 \over 2}\int \nolimits _{E(t)} {\dot {\mathbf {x}}(t)^T\dot {\mathbf {x}}(t)dm(t)} \). It turns out that \(T_R (t) = \min \) is equivalent to the vanishing of the relative angular momentum of the earth
$$\displaystyle \begin{aligned} {\mathbf{h}}_R (t) = \int\nolimits_{E(t)} {[\mathbf{x}(t)\times ]\dot{\mathbf{x}}\,(t)\,dm(t) = \mathbf{0}.} \end{aligned} $$
(130)
The above condition does not uniquely define the orientation of the reference system. Indeed a time-independent rotation x′(t) = R0x(t) leads to a relative angular momentum \(\mathbf {{h}^{\prime }}_R = \int \nolimits _E {[({\mathbf {R}}_0 \mathbf {x})\times ]({\mathbf {R}}_0 \dot {\mathbf {x}})\, dm} = {\mathbf {R}}_0 {\mathbf {h}}_R = \mathbf {0}\), which also vanishes. There is therefore an infinite set of Tisserand reference systems, all of them being parallel in the already explained sense. A most general Tisserand system can be obtained by relaxing the strict “geocentricity” requirement xG(t) = 0, to xG(t) = constant (equivalently \(\dot {\mathbf {x}}_G (t) = \mathbf {0}\)). In this case the geocentric relative angular momentum will be given instead by
$$\displaystyle \begin{aligned} {\mathbf{h}}_R = \int\nolimits_E {[(\mathbf{x} - {\mathbf{x}}_G )\times ]\,(\dot{\mathbf{x}} - \dot{\mathbf{x}}_G )\,dm = \mathbf{0}.} \end{aligned} $$
(131)
The ideas of Tisserand can be applied to the case of a discrete geodetic network if we consider the network points as mass points with equal mass, which we may take to be unity without loss of generality. In such a case we must replace integration with summation, so that with the replacements
$$\displaystyle \begin{aligned} \frac{1}{M}\int\nolimits_E {( \cdot )dm} &\to \frac{1}{n}\sum_{i = 1}^n {( \cdot )} , \quad \mathbf{x} \to {\mathbf{x}}_i , \quad {\mathbf{x}}_G \to \mathbf{m} = \frac{1}{n}\sum_{i = 1}^n {{\mathbf{x}}_i } , \quad \mathrm{and}\\ {\mathbf{h}}_R &\to \mathbf{h} = \sum_{i = 1}^n {[({\mathbf{x}}_i } - \mathbf{m})\times ]\,(\dot{\mathbf{x}}_i - \dot{\mathbf{m}}) \end{aligned} $$
the conditions \(\dot {\mathbf {x}}_G (t) = \mathbf {0}\), hR = 0 are replaced by \(\dot {\mathbf {m}} = \mathbf {0}\) and h = 0. But these are exactly the conditions we have obtained as necessary for producing spatiotemporal reference systems which are shortest geodesics! Therefore, our theory for discrete point networks is just a special case of Tisserand’s choice of reference system (slightly generalized with respect to the origin choice). The connection of the two theories is an important result with practical consequences, since it leads to appropriate minimal constraints for application in data analysis, the so-called kinematic constraints [1, 2]. These lead to an optimal choice of the spatiotemporal reference system, with minimized station motions, which best represents the temporal evolution of the network shape. For further details see [23, 24, 25].
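For a discrete network solution given as station positions and velocities, checking how far a chosen frame is from the Tisserand-optimal one amounts to evaluating the two discrete conditions; a minimal sketch, treating the stations as unit point masses (names are illustrative):

```python
import numpy as np

def kinematic_residuals(X, V):
    """Discrete Tisserand conditions underlying the kinematic constraints:
    barycenter velocity (m_dot = 0) and relative angular momentum h of
    Eq. (122) (h = 0), for (n,3) station positions X and velocities V."""
    m, mdot = X.mean(axis=0), V.mean(axis=0)
    h = sum(np.cross(x - m, v - mdot) for x, v in zip(X, V))
    return mdot, h      # both vanish in a Tisserand-optimal frame
```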

7 Reference System Definition in the Analysis of Coordinate Time Series

Observational data, related to the geometric configuration (shape) of a geodetic network at the epoch of observation, are collected by four space techniques: GPS/GNSS [35], VLBI [59], SLR [51] and DORIS [62]. From the viewpoint of formal statistical optimality, all these data should be analyzed simultaneously. At first sight, this appears to be a herculean task. It is however possible to combine the normal equations, derived from different subsets of data, into a new set that provides the combined solution. Computational difficulties are therefore not a prohibitive factor. There are two levels in dividing the data into subsets. At the higher level, data from different space techniques are analyzed separately. At the lower level, data from a particular space technique, collected over a long time interval, are separated into subsets corresponding to smaller time intervals of one day, one week, or the few days of the duration of a VLBI session. Thus, single epoch solutions become available as a first step of data analysis.

Returning to the question of statistical optimality, we must realize that it holds under the particular assumptions that observation errors are zero mean random variables and that the covariance matrix of all observations is known up to a scale factor. These assumptions however are not valid in a real world situation. The presence of systematic errors, as well as correlations between data from different epochs, is ignored in the separate per epoch analysis. Therefore, the analysis of the data in subsets may have advantages other than the obvious computational convenience. Such an advantage is easier to realize in the presence of outliers, which are difficult to detect in the analysis of a large data set, where the effect of even a single outlier spreads over a large number of error estimates. Treating data in subsets enhances the possibility of detecting “bad data” and is at least a necessary preprocessing step in the overall data analysis. Physical considerations can also help in detecting bad data. For example, the fact that a network deforms slowly allows the detection of bad epoch solutions, which stand out in the produced coordinate series in a way that is highly improbable to arise from the effect of random errors.

Following the standard practice, we will start with the analysis of coordinate time series, which are the result of per epoch data analysis. The estimated network shape is represented through a set of station coordinate estimates obtained with the use of a particular set of (hopefully minimal) constraints. Thus every epoch has its own independently chosen reference system. The task of the next step, which is usually called stacking, is twofold: the first is to achieve a smoothing interpolation of the discrete series, through a coordinate evolution model of the general form xi(t) = fi(t, a), where a is a set of unknown model coefficients; the second is to choose an optimal reference system for the description of the interpolated continuous temporal variation of the network shape. The usual choice is a linear-in-time coordinate variation or constant velocity model xi(t) = x0i + (t − t0)vi, where x0i = xi(t0) are the initial coordinates and vi the velocity of station i. The station coordinate estimates \(\hat {\mathbf {x}}_i (t_k )\) at epoch tk are treated as pseudo-observations \({\mathbf {x}}_i^{ob} (t_k )\), which are expressed as functions of the coordinates xi(t) in a spatiotemporal reference system common to all epochs
$$\displaystyle \begin{aligned} {\mathbf{x}}_i^{ob} (t_k)= (1 + s_k )\mathbf{R}(\boldsymbol{\uptheta }_k ){\mathbf{x}}_i (t_k ) + {\mathbf{d}}_k+{\mathbf{e}}_{ik} = (1 + s_k )\mathbf{R}(\boldsymbol{\uptheta }_k )\left[ {{\mathbf{x}}_{i0} + (t_k - t_0 ){\mathbf{v}}_i } \right] + {\mathbf{d}}_k + {\mathbf{e}}_{ik} . \end{aligned} $$
(132)
Here θk, dk, sk are the transformation parameters from the new (not yet defined) reference system to that of epoch tk, and eik are the observational errors. As usual, linearization based on the approximation R(θk) ≈I − [θk×] leads to a sufficiently approximate linear model of the form
$$\displaystyle \begin{aligned} {\mathbf{x}}_i^{ob} (t_k ) &= {\mathbf{x}}_{i0} + (t_k - t_0 ){\mathbf{v}}_i + s_k {\mathbf{x}}_{0i}^{ap} + [{\mathbf{x}}_{0i}^{ap} \times ]\boldsymbol{\uptheta }_k + {\mathbf{d}}_k + {\mathbf{e}}_{ik} =\\ &= {\mathbf{x}}_{i0} + (t_k - t_0 ){\mathbf{v}}_i+ \left[ {[{\mathbf{x}}_{0i}^{ap} \times ]\quad {\mathbf{I}}_3 \quad {\mathbf{x}}_{0i}^{ap} } \right]\left[ \begin{array}{c} {\boldsymbol{\uptheta }_k } \\ {{\mathbf{d}}_k } \\ {s_k } \end{array} \right] + {\mathbf{e}}_{ik} =\\ &\equiv {\mathbf{x}}_{i0} + (t_k - t_0 ){\mathbf{v}}_i + {\mathbf{E}}_i {\mathbf{z}}_k + {\mathbf{e}}_{ik} \qquad i = 1,2,\ldots ,n,\quad k = 1,2,\ldots ,m, \end{aligned} $$
(133)
where \({\mathbf {x}}_{0i}^{ap} \) are approximate values to the initial coordinates xi0. Alternatively, the model can be expressed in terms of corrections to approximate values δx0i = \({\mathbf {x}}_{0i} - {\mathbf {x}}_{0i}^{ap} \), \(\delta {\mathbf {v}}_i = {\mathbf {v}}_i - {\mathbf {v}}_i^{ap} \), \(\delta \boldsymbol {\uptheta }_k = \boldsymbol {\uptheta }_k - \boldsymbol {\uptheta }_k^{ap} \), \(\delta {\mathbf {d}}_k = {\mathbf {d}}_k - {\mathbf {d}}_k^{ap} \), \(\delta s_k = s_k - s_k^{ap} \) as
$$\displaystyle \begin{aligned} \Delta {\mathbf{x}}_i^{ob} (t_k ) & = \delta {\mathbf{x}}_{i0} + (t_k - t_0 )\delta {\mathbf{v}}_i + \left[ {\left[ {{\mathbf{x}}_{0i}^{ap} \times } \right]\quad {\mathbf{I}}_3 \quad \,{\mathbf{x}}_{0i}^{ap} } \right]\left[ \begin{array}{c} {\delta \boldsymbol{\uptheta }_k } \\ {\delta {\mathbf{d}}_k } \\ {\delta s_k } \end{array} \right] + {\mathbf{e}}_{ik}=\\ &= \delta {\mathbf{x}}_{i0} + (t_k - t_0 )\delta {\mathbf{v}}_i + {\mathbf{E}}_i \delta {\mathbf{z}}_k + {\mathbf{e}}_{ik} , \end{aligned} $$
(134)
where
$$\displaystyle \begin{aligned} \Delta {\mathbf{x}}_i^{ob} (t_k ) = {\mathbf{x}}_i^{ob} (t_k ) - {\mathbf{x}}_{0i}^{ap} - (t_k - t_0 ){\mathbf{v}}_i^{ap} - s_k^{ap} {\mathbf{x}}_{0i}^{ap} - [{\mathbf{x}}_{0i}^{ap} \times ]\boldsymbol{\uptheta }_k^{ap} - {\mathbf{d}}_k^{ap} , \end{aligned} $$
(135)
are the reduced observations. The form (134) in terms of corrections is more appropriate when iterations are performed, using the estimates of each step as approximate values in the next, until convergence. This solves the least squares estimation problem for the original nonlinear model.
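A minimal numpy sketch of the coefficient blocks of one linearized observation equation (133), for a single station i at epoch tk (the function name and argument conventions are illustrative only):

```python
import numpy as np

def skew(a):
    """Cross product matrix [a x]."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def stacking_blocks(x0i_ap, t_k, t_0):
    """Coefficient blocks of Eq. (133) for station i at epoch t_k:
    partials w.r.t. x_0i, v_i and z_k = (theta_k, d_k, s_k)."""
    A_x0 = np.eye(3)                          # coefficient of x_0i
    A_v = (t_k - t_0) * np.eye(3)             # coefficient of v_i
    E_i = np.hstack([skew(x0i_ap), np.eye(3),
                     x0i_ap.reshape(3, 1)])   # E_i = [[x_ap x]  I_3  x_ap]
    return A_x0, A_v, E_i
```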

For purely geodetic purposes, the parameters of interest are the initial coordinates xi0 and the velocities vi, while the transformation parameters \({\mathbf {z}}_k = [\boldsymbol {\uptheta }_k^T {\mathbf {d}}_k^T s_k ]^T\) are nuisance parameters. However, their estimates are needed in order to convert available estimates of Earth Orientation Parameters (EOPs) from the reference system at epoch tk to the new common reference system. EOPs connect the celestial reference system with the adopted terrestrial reference system at epoch tk and are different for each space technique. We will first look into the stacking problem of coordinate time series without taking EOP coordinate time series into consideration.

The inclusion of a separate sk at every epoch tk does not conform with our basic deterministic and stochastic model, since observations from all space techniques are not invariant under scale transformations and have their own scale provided by their unit of length. This in turn is just the product of the constant speed of light with their corresponding unit of time, which is realized by a particular set of atomic clocks. At most one could assume a linear drift in these clocks, which could be accommodated by a linear model \(s_k = s_0 + (t_k - t_0 )\,\dot {s}\), arriving at an observation model where only the initial scale s0 = s(t0) and the scale rate \(\dot {s}\) appear as unknown parameters. The inclusion of different scale factors sk is nevertheless a standard practice that has proven to be effective. This is so for two reasons. The first has to do with the fact that the assumption of zero mean random errors, on which the least squares estimation is based, is not really valid. In addition to the random errors, which will cause random variations in the per epoch estimated network shape, various stochastic effects will cause corresponding shape variations. The use of a different scale parameter per epoch “absorbs” a part of these variations. The second reason is that although the observations have no absolute rank defect with respect to scale, they may be in a close to rank defect situation. Although the observations are not absolutely invariant under scale transformation, their variation may be very small and thus negligible. It is known from computational experience that only VLBI and SLR provide observations with strong scale information, while scale information is weak in GNSS and DORIS. For this reason the scale of GNSS and DORIS is not taken into consideration when combining data from the four techniques in order to formulate the ITRF.

Such close to rank deficiency situations apply also to translation and rotation transformations. VLBI has an absolute rank deficiency with respect to translation. SLR strongly senses the geocenter through the satellite orbits and thus has no translation defect or weakness, when the geocenter is used as the origin of the reference system. GPS and DORIS also sense the geocenter, but in a much weaker way, so that their origin/translational information is of much lower quality. It is therefore necessary to be able to detect deficiencies and weaknesses attributed to the definition of the reference system a priori, in a preprocessing stage. The number of deficiencies and weaknesses is mirrored in the number of eigenvalues of the normal equations matrix N of the solution at each epoch, which have (practically) zero or close to zero values. It is not clear though, whether these are indeed associated with defects in the reference system definition, or to which one of the particular possible types of defect (origin, orientation, scale) they can be attributed. The reason is that the corresponding eigenvectors ui have no physical meaning. On the contrary, the columns of the inner constraint matrix \(\mathbf {E} = [{\mathbf {e}}_{\theta _1 } {\mathbf {e}}_{\theta _2 } {\mathbf {e}}_{\theta _3 } {\mathbf {e}}_{d_1 } {\mathbf {e}}_{d_2 } {\mathbf {e}}_{d_3 } {\mathbf {e}}_s ]\) are associated with orientation components, translation components and scale. In fact they form a basis of the null space N(N) of N, which is identical to the null space N(A) of the design matrix A. Combining eigenvectors ui having zero or very small eigenvalues λi with the columns of E, [21] have introduced three appropriate rank deficiency interpretation indices for the detection of defects or weaknesses in the reference system definition. The first index
$$\displaystyle \begin{aligned} \omega_{Q,i} {=} \arctan \sqrt{\frac{{\mathbf{u}}_i^T {\mathbf{u}}_i }{{\mathbf{u}}_i^T {\mathbf{E}}_Q ({\mathbf{E}}_Q^T {\mathbf{E}}_Q )^{ - 1}{\mathbf{E}}_Q^T {\mathbf{u}}_i } - 1} {=} \arcsin \sqrt{1 {-} \frac{{\mathbf{u}}_i^T {\mathbf{E}}_Q ({\mathbf{E}}_Q^T {\mathbf{E}}_Q )^{ - 1}{\mathbf{E}}_Q^T {\mathbf{u}}_i }{{\mathbf{u}}_i^T {\mathbf{u}}_i }} ,\end{aligned} $$
(136)
is the angle between any eigenvector ui and a subspace of N(N) = N(A) spanned by a subset of the columns of E. For example, \({\mathbf {E}}_Q = [{\mathbf {e}}_{\theta _1 } {\mathbf {e}}_{\theta _2 } {\mathbf {e}}_{\theta _3 } ]\) serves to check whether a zero or close to zero eigenvalue λi (Nui = λiui) can be attributed to an orientation defect of the reference system. The second index
$$\displaystyle \begin{aligned} \psi_q = \arctan \sqrt{\frac{{\mathbf{e}}_q^T {\mathbf{e}}_q }{{\mathbf{e}}_q^T \mathbf{U}({\mathbf{U}}^T\mathbf{U})^{ - 1}{\mathbf{U}}^T{\mathbf{e}}_q } - 1} = \arcsin \sqrt{1 - \frac{{\mathbf{e}}_q^T \mathbf{U}({\mathbf{U}}^T\mathbf{U})^{ - 1}{\mathbf{U}}^T{\mathbf{e}}_q }{{\mathbf{e}}_q^T {\mathbf{e}}_q }} , \end{aligned} $$
(137)
where U = [⋯ui⋯ ] is the matrix with columns the eigenvectors with zero or very small eigenvalues, is the angle between one of the columns of E and spanU, the set of all linear combinations of the columns of U. The third index is the pair
$$\displaystyle \begin{aligned} \chi_{Q,\min } = \sqrt{1 - \mu_{\min } } ,\qquad \chi_{Q,\max } = \sqrt{1 - \mu_{\max } } , \end{aligned} $$
(138)
where \(\mu _{\min } \) and \(\mu _{\max } \) are the largest and the smallest eigenvalue, respectively, of the matrix \({\mathbf {U}}^T{\mathbf {E}}_Q ({\mathbf {E}}_Q^T {\mathbf {E}}_Q )^{ - 1}{\mathbf {E}}_Q^T \mathbf {U}\). They correspond to the minimum and maximum angle between any vector in spanEQ ⊂ N(N) and any vector in spanU. The indices also apply to the extended case of
$$\displaystyle \begin{aligned} \mathbf{E} = [{\mathbf{e}}_{\theta_1^0 } {\mathbf{e}}_{\theta_2^0 } {\mathbf{e}}_{\theta_3^0 } {\mathbf{e}}_{d_1^0 } {\mathbf{e}}_{d_2^0 } {\mathbf{e}}_{d_3^0 } {\mathbf{e}}_{s_0}\ \vert\ {\mathbf{e}}_{\dot{\theta }_1^0 } {\mathbf{e}}_{\dot{\theta }_2^0 } {\mathbf{e}}_{\dot{\theta }_3^0 } {\mathbf{e}}_{\dot{d}_1^0} {\mathbf{e}}_{\dot{d}_2^0} {\mathbf{e}}_{\dot{d}_3^0} {\mathbf{e}}_{\dot{s}}], \end{aligned} $$
(139)
for reference system deficiencies in initial epoch orientation (\(\theta _1^0 , \theta _2^0 ,\theta _3^0 )\), origin (\(d_1^0 , d_2^0 ,d_3^0 )\) and scale (s0), as well as in orientation rate (\(\dot {\theta }_1^0 , \dot {\theta }_2^0 ,\dot {\theta }_3^0 )\), origin rate (\(\dot {d}_1^0 , \dot {d}_2^0 , \dot {d}_3^0 )\) and scale rate (\(\dot {s})\), as we will see below.
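A possible numerical realization of the indices (136) and (138), assuming the columns of U are orthonormal eigenvectors of N with (near-)zero eigenvalues (the function name is an assumption of this illustration):

```python
import numpy as np

def deficiency_indices(U, E_Q):
    """Sketch of the interpretation indices (136) and (138).
    U   : eigenvectors of N with (near-)zero eigenvalues, as columns
    E_Q : selected columns of the inner constraints matrix E
          (e.g. the three orientation columns)"""
    P_Q = E_Q @ np.linalg.solve(E_Q.T @ E_Q, E_Q.T)   # projector on span(E_Q)
    # Eq. (136): angle between each eigenvector u_i and span(E_Q)
    omega = np.array([np.arcsin(np.sqrt(1.0 - (u @ P_Q @ u) / (u @ u)))
                      for u in U.T])
    # Eq. (138): sines of the extreme angles between span(E_Q) and span(U)
    mu = np.linalg.eigvalsh(U.T @ P_Q @ U)
    chi_min, chi_max = np.sqrt(1.0 - mu.max()), np.sqrt(1.0 - mu.min())
    return np.degrees(omega), chi_min, chi_max
```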

These indices help in understanding the quality of the informational content of each space technique in relation to the reference system definition. They are also useful in detecting isolated bad epochs, which do not conform with the overall deficiency characteristics of the time series and should not be included in the stacking solution.

Even if no deficiencies or weaknesses with respect to translation and rotation are detected with the above indices, the relevant parameters θk, dk are included in the stacking solution in order to remove systematic effects to some degree, as done in the case of the scale parameters sk.

The least squares stacking adjustment using the above linearized observation model leads to normal equations with a rank defect, admitting infinitely many least squares estimates \(\hat {\mathbf {x}}_{0i} \), \(\hat {\mathbf {v}}_i \), i = 1, 2, …, n and \(\hat {\mathbf {z}}_k \), k = 1, 2, …, m. Each solution corresponds to a different choice of the spatiotemporal reference system. Numerical computations reveal that the normal equations have 14 practically zero eigenvalues, corresponding to a rank defect of 14. Therefore 14 minimal constraints must be introduced. It is thus essential to determine the inner constraints matrix E, appearing in the linearized transformation of the total unknowns x′ = x + Ep, with transformation parameters \(\mathbf {p} = \left [ {\boldsymbol {\psi }^T{\mathbf {c}}^T\lambda } \right ]^T\) (rotation angles ψ, translations c, scale parameter λ). The problem here lies in the fact that a change of the spatiotemporal reference system is a change of the reference system at every epoch. The transformation from the instantaneous coordinates xi(t) = x0i + (t − t0)vi, into instantaneous coordinates
$$\displaystyle \begin{aligned} \mathbf{{x}^{\prime}}_i (t) &= [1 + \lambda (t)]\mathbf{R}(\boldsymbol{\uppsi }(t))[{\mathbf{x}}_{0i} + (t - t_0 ){\mathbf{v}}_i ] + \mathbf{c}(t)\approx\\ & \approx {\mathbf{x}}_{0i} + (t - t_0 ){\mathbf{v}}_i + [{\mathbf{x}}_{0i}^{ap} \times ]\boldsymbol{\uppsi }(t) + \mathbf{c}(t) + \lambda (t){\mathbf{x}}_{0i}^{ap} , \end{aligned} $$
(140)
reveals two problems. Firstly, the resulting coordinate model is not linear-in-time: x′i(t)≠x′0i + (t − t0)v′i for any parameters x′0i, v′i. Secondly, the transformation parameters, as functions ψ(t), c(t), λ(t), are infinite in number! Even if one restricts the analysis to only the discrete epochs tk, k = 1, 2, …, m, in the per epoch transformations
$$\displaystyle \begin{aligned} \mathbf{{x}^{\prime}}_i (t_k ) \approx {\mathbf{x}}_{0i} + (t_k - t_0 ){\mathbf{v}}_i + [{\mathbf{x}}_{0i}^{ap} \times ]\boldsymbol{\uppsi }(t_k ) + \mathbf{c}(t_k ) + \lambda (t_k ){\mathbf{x}}_{0i}^{ap} , \end{aligned} $$
(141)
there are 7m transformation parameters involved, many more than the numerically detected rank deficiency of 14. A simple-minded approach is to arbitrarily restrict the transformation functions to ones that are linear-in-time, namely
$$\displaystyle \begin{aligned} \boldsymbol{\uppsi }(t) = \boldsymbol{\uppsi }_0 + (t - t_0 )\dot{\boldsymbol{\psi }},\quad \mathbf{c}(t) = {\mathbf{c}}_0 + (t - t_0 )\dot{\mathbf{c}},\quad \lambda (t) = \lambda_0 + (t - t_0 )\dot{\lambda }, \end{aligned} $$
(142)
so that the resulting x′i(t) is of the second degree with respect to (t − t0), and then take advantage of the fact that the coefficient of (t − t0)2 turns out to be a negligible quantity of the second order. This non-rigorous approach implements the correct number of transformation parameters, i.e., the 14 parameters ψ0, c0, λ0, \(\dot {\boldsymbol {\psi }}\), \(\dot {\mathbf {c}}\), \(\dot {\lambda }\).
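For completeness, a sketch of this substitution: inserting (142) into the per epoch transformation (141) and collecting the constant and linear terms gives
$$\displaystyle \begin{aligned} \mathbf{{x}^{\prime}}_i (t_k ) \approx \left( {\mathbf{x}}_{0i} + [{\mathbf{x}}_{0i}^{ap} \times ]\boldsymbol{\uppsi }_0 + {\mathbf{c}}_0 + \lambda_0 {\mathbf{x}}_{0i}^{ap} \right) + (t_k - t_0 )\left( {\mathbf{v}}_i + [{\mathbf{x}}_{0i}^{ap} \times ]\dot{\boldsymbol{\psi }} + \dot{\mathbf{c}} + \dot{\lambda }\,{\mathbf{x}}_{0i}^{ap} \right), \end{aligned} $$
i.e., again a linear-in-time model with transformed initial coordinates and velocities, while the neglected (tk − t0)2 contribution of the exact model (140) consists only of products of small quantities such as \(\dot {\lambda }(t_k - t_0 ){\mathbf {v}}_i \). This anticipates the separation (146) below.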

A rigorous solution to this problem has been presented by Chatzinikos and Dermanis [20] who have directly investigated the rank deficiency of the design matrix of the stacking problem and have revealed the relation between any two least squares solutions. Their results are summarized into two propositions:

Proposition 1

If\(\hat {\mathbf {x}}_{0i} \) , \(\hat {\mathbf {v}}_i \) , \(\hat {\mathbf {z}}_k \), i = 1, …, n, k = 1, …, m, is one of the least squares solutions of the stacking problem, then\({\hat {\mathbf {x}}^{\prime }}_{0i} = \hat {\mathbf {x}}_{0i} + {\mathbf {E}}_i {\mathbf {p}}_0 \) , \(\hat {\mathbf {{v}^{\prime }}}_i = \hat {\mathbf {v}}_i + {\mathbf {E}}_i \dot {\mathbf {p}}\) , \({\hat {\mathbf {z}}^{\prime }}_k =\)\(\hat {\mathbf {z}}_k - {\mathbf {p}}_0 - (t_k - t_0 )\dot {\mathbf {p}}\), is also a least squares solution for any values of the constants\({\mathbf {p}}_0 = [\boldsymbol {\uppsi }_0^T {\mathbf {c}}_0^T \lambda _0 ]^{\mathrm {T}}\) , \(\dot {\mathbf {p}} = [\dot {\boldsymbol {\psi }}^T\,\dot {\mathbf {c}}^T\,\dot {\lambda }]^{\mathrm {T}}\).

Proposition 2

If\(\hat {\mathbf {x}}_{0i} \) , \(\hat {\mathbf {v}}_i \) , \(\hat {\mathbf {z}}_k \)and\({\hat {\mathbf {x}}^{\prime }}_{0i} \) , \({\hat {\mathbf {v}}^{\prime }}_i \) , \({\hat {\mathbf {z}}^{\prime }}_k \)are two least squares solutions of the stacking problem, then there exist constants\({\mathbf {p}}_0 = [\boldsymbol {\uppsi }_0^T \,{\mathbf {c}}_0^T \,\lambda _0 ]^{\mathrm {T}}\) , \(\dot {\mathbf {p}} = [\dot {\boldsymbol {\psi }}^T\,\dot {\mathbf {c}}^T\,\dot {\lambda }]^{\mathrm {T}}\), such that\({\hat {\mathbf {x}}^{\prime }}_{0i} = \hat {\mathbf {x}}_{0i} + {\mathbf {E}}_i {\mathbf {p}}_0 \) , \(\hat {\mathbf {{v}^{\prime }}}_i = \hat {\mathbf {v}}_i + {\mathbf {E}}_i \dot {\mathbf {p}}\), i = 1, …, n, and\({\hat {\mathbf {z}}^{\prime }}_k = \hat {\mathbf {z}}_k - {\mathbf {p}}_0 -\)\((t_k - t_0 )\dot {\mathbf {p}}\), k = 1, …, m.

On the basis of the above propositions we may construct the desired parameter transformation under change of the reference system, having the general form x′tot = xtot + Etotptot. Letting x0 contain the station initial coordinates x0i, v the station velocities vi, z the per epoch unknown transformation parameters zk, and E the matrices Ei, the desired parameter transformation becomes, in the case of stacking,
$$\displaystyle \begin{aligned} \mathbf{{x}^{\prime}}_{tot} = \left[ \begin{array}{c} {\mathbf{{x}^{\prime}}_0 } \\ \mathbf{{v}^{\prime}} \\ \mathbf{{z}^{\prime}} \end{array} \right] = {\mathbf{x}}_{tot} + {\mathbf{E}}_{tot} {\mathbf{p}}_{tot} = \left[ \begin{array}{c} {{\mathbf{x}}_0 } \\ \mathbf{v} \\ \mathbf{z} \end{array} \right] + \left[ \begin{array}{cc} \mathbf{E} & \mathbf{0} \\ \mathbf{0} & \mathbf{E} \\ \mathbf{J} & {{\mathbf{J}}_t } \end{array} \right]\left[ \begin{array}{c} {{\mathbf{p}}_0 } \\ \dot{\mathbf{p}} \end{array} \right] = \left[ \begin{array}{cc} {{\mathbf{x}}_0 + \mathbf{Ep}_0 } \\ {\mathbf{v} + \mathbf{E}\dot{\mathbf{p}}} \\ {\mathbf{z} + \mathbf{Jp}_0 + {\mathbf{J}}_t \dot{\mathbf{p}}} \end{array} \right],\end{aligned} $$
(143)
where
$$\displaystyle \begin{aligned} \mathbf{E} = \left[ \begin{array}{c} \vdots \\ {{\mathbf{E}}_i } \\ \vdots \end{array} \right],\qquad \mathbf{J} = \left[ \begin{array}{c} \vdots \\ { - {\mathbf{I}}_7 } \\ \vdots \end{array} \right],\qquad {\mathbf{J}}_t = \left[ \begin{array}{c} \vdots \\ { - (t_k - t_0 ){\mathbf{I}}_7 } \\ \vdots \end{array} \right], \end{aligned} $$
(144)
or separately
$$\displaystyle \begin{aligned} \mathbf{{x}^{\prime}}_0 = {\mathbf{x}}_0 + \mathbf{Ep}_0 ,\qquad \mathbf{{v}^{\prime}} = \mathbf{v} + \mathbf{E}\dot{\mathbf{p}},\quad \,\mathbf{{z}^{\prime}} = \mathbf{z} + \mathbf{Jp}_0 + {\mathbf{J}}_t \dot{\mathbf{p}}. \end{aligned} $$
(145)
The transformation parameters are \({\mathbf {p}}_0 = [\boldsymbol {\psi }_0^T \,{\mathbf {c}}_0^T \,\lambda _0 ]^{\mathrm {T}}\) and \(\dot {\mathbf {p}} = [\dot {\boldsymbol {\psi }}^T\,\dot {\mathbf {c}}^T\,\dot {\lambda }]^{\mathrm {T}}\). The most important characteristic of this transformation is the separation between the transformation of the initial coordinates and that of the velocities
$$\displaystyle \begin{aligned} \mathbf{{x}^{\prime}}_0 = {\mathbf{x}}_0 + \mathbf{Ep}_0 ,\qquad \mathbf{{v}^{\prime}} = \mathbf{v} + \mathbf{E}\dot{\mathbf{p}}. \end{aligned} $$
(146)
This allows converting any least squares solution \(\hat {\mathbf {x}}_0 \), \(\hat {\mathbf {v}}\) to another solution having desired properties with respect to initial coordinates and to velocities, separately. In addition, the inner constraints matrix Etot can be implemented in order to derive inner or partial inner constraints in their simple or generalized form, as already described in Sect. 5.
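A short sketch of this a-posteriori frame change, Eqs. (145)-(146), for a stacking solution (argument names and array layouts are assumptions of this illustration):

```python
import numpy as np

def change_frame(x0, v, z, E, p0, pdot, epochs, t0):
    """Sketch of Eqs. (145)-(146): move a stacking solution to the parallel
    frame defined by constant parameters p0 and rates pdot.
    x0, v : stacked 3n-vectors of initial coordinates and velocities
    z     : list of m per epoch 7-vectors of transformation parameters
    E     : (3n,7) inner constraints matrix built from the blocks E_i"""
    x0_new = x0 + E @ p0     # initial coordinates, Eq. (146)
    v_new = v + E @ pdot     # velocities, Eq. (146)
    z_new = [zk - p0 - (tk - t0) * pdot for zk, tk in zip(z, epochs)]
    return x0_new, v_new, z_new
```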

8 Various Types of Minimal Constraints for the Definition of a Spatiotemporal Reference System

Utilizing the derived transformation matrix
$$\displaystyle \begin{aligned} {\mathbf{E}}_{tot} = \left[ \begin{array}{cc} \mathbf{E} & \mathbf{0} \\ \mathbf{0} & \mathbf{E} \\ \mathbf{J} & {{\mathbf{J}}_t } \end{array} \right], \end{aligned} $$
(147)
the inner constraints\({\mathbf {E}}_{tot}^T {\mathbf {x}}_{tot} = \mathbf {0}\) take the form
$$\displaystyle \begin{aligned} {\mathbf{E}}^T{\mathbf{x}}_0 + {\mathbf{J}}^T\mathbf{z} &= \sum_{i = 1}^n {{\mathbf{E}}_i^T {\mathbf{x}}_{0i} } - \sum_{k = 1}^m {{\mathbf{z}}_k } = \sum_{i = 1}^n {\left[ \begin{array}{c} { - [{\mathbf{x}}_{0i}^{ap} \times ]} \\ {{\mathbf{I}}_3 } \\ {({\mathbf{x}}_{0i}^{ap} )^T} \end{array} \right]{\mathbf{x}}_{0i} } - \sum_{k = 1}^m {\left[ \begin{array}{c} {\boldsymbol{\uptheta }_k } \\ {{\mathbf{d}}_k } \\ {s_k } \end{array} \right]}=\\ &= \left[ \begin{array}{c} { - \sum_{i = 1}^n {[{\mathbf{x}}_{0i}^{ap} \times ]{\mathbf{x}}_{0i} } - \sum_{k = 1}^m {\boldsymbol{\uptheta }_k } } \\ {\sum_{i = 1}^n {{\mathbf{x}}_{0i} } - \sum_{k = 1}^m {{\mathbf{d}}_k } } \\ {\sum_{i = 1}^n {({\mathbf{x}}_{0i}^{ap} )^T {\mathbf{x}}_{0i} } - \sum_{k = 1}^m {s_k } } \end{array} \right] = \mathbf{0}, \end{aligned} $$
(148)
$$\displaystyle \begin{aligned} {\mathbf{E}}^T\mathbf{v} + {\mathbf{J}}_t^T \mathbf{z} &= \sum_{i = 1}^n {{\mathbf{E}}_i^T {\mathbf{v}}_i } - \sum_{k = 1}^m {(t_k - t_0 ){\mathbf{z}}_k } = \sum_{i = 1}^n {\left[ \begin{array}{c} { - [{\mathbf{x}}_{0i}^{ap} \times ]} \\ {{\mathbf{I}}_3 } \\ {({\mathbf{x}}_{0i}^{ap} )^T} \end{array} \right]{\mathbf{v}}_i }-\\ &\quad - \sum_{k = 1}^m {(t_k - t_0 )\left[ \begin{array}{c} {\boldsymbol{\uptheta }_k } \\ {{\mathbf{d}}_k } \\ {s_k } \end{array} \right]} =\\ & = \left[ \begin{array}{c} { - \sum_{i = 1}^n {[{\mathbf{x}}_{0i}^{ap} \times ]{\mathbf{v}}_i } - \sum_{k = 1}^m {(t_k - t_0 )\boldsymbol{\uptheta }_k } } \\ {\sum_{i = 1}^n {{\mathbf{v}}_i } - \sum_{k = 1}^m {(t_k - t_0 ){\mathbf{d}}_k } } \\ {\sum_{i = 1}^n {({\mathbf{x}}_{0i}^{ap} )^T {\mathbf{v}}_i } - \sum_{k = 1}^m {(t_k - t_0 )s_k } } \end{array} \right] = \mathbf{0}. \end{aligned} $$
(149)
The first 7 constraints define the reference system at the initial epoch and the second 7 its time rate, i.e., its temporal evolution. Of the 7 initial epoch constraints (148), the first 3 define the initial orientation, the next 3 the initial origin and the last one the initial scale. Of the 7 rate constraints (149), the first 3 define the orientation rate, the next 3 the origin rate and the last one the scale rate.
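For checking whether a given stacking solution satisfies these 14 inner constraints, the left-hand sides of (148) and (149) can be evaluated directly; a sketch, assuming (n,3) and (m,7) array layouts (names are illustrative):

```python
import numpy as np

def inner_constraint_values(X0_ap, X0, V, Z, epochs, t0):
    """Left-hand sides of the 14 inner constraints (148)-(149) for a
    candidate stacking solution; both 7-vectors should vanish.
    X0_ap, X0, V : (n,3) approximate / estimated initial coordinates, velocities
    Z            : (m,7) per epoch parameters, ordered as (theta, d, s)"""
    def blocks(Y):   # sum_i E_i^T y_i with E_i = [[x_ap x]  I_3  x_ap]
        rot = -sum(np.cross(xa, y) for xa, y in zip(X0_ap, Y))
        tra = Y.sum(axis=0)
        scl = sum(xa @ y for xa, y in zip(X0_ap, Y))
        return np.concatenate([rot, tra, [scl]])
    dt = np.asarray(epochs) - t0
    c_init = blocks(X0) - Z.sum(axis=0)                 # Eq. (148)
    c_rate = blocks(V) - (dt[:, None] * Z).sum(axis=0)  # Eq. (149)
    return c_init, c_rate
```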
In addition to the above inner constraints \({\mathbf {E}}_{tot}^T {\mathbf {x}}_{tot} = \mathbf {0}\), one may use
  (a) their generalized version \({\mathbf {E}}_{tot}^T {\mathbf {W}}_{tot} ({\mathbf {x}}_{tot} - {\mathbf {x}}_{tot}^{ref} ) = \mathbf {0}\), satisfying \(({\mathbf {x}}_{tot} - {\mathbf {x}}_{tot}^{ref} )^T {\mathbf {W}}_{tot} ({\mathbf {x}}_{tot} - {\mathbf {x}}_{tot}^{ref} ) = \min \) (weighted minimum distance constraints), and the special cases
  (b) \({\mathbf {E}}_{tot}^T {\mathbf {W}}_{tot} {\mathbf {x}}_{tot} = \mathbf {0}\), satisfying \({\mathbf {x}}_{tot}^T {\mathbf {W}}_{tot} {\mathbf {x}}_{tot} = \min \) (weighted minimum norm constraints), and
  (c) \({\mathbf {E}}_{tot}^T {\mathbf {x}}_{tot} = {\mathbf {E}}_{tot}^T {\mathbf {x}}_{tot}^{ref} \), satisfying \(({\mathbf {x}}_{tot} - {\mathbf {x}}_{tot}^{ref} )^T\,({\mathbf {x}}_{tot} - {\mathbf {x}}_{tot}^{ref} ) = \min \) (minimum distance constraints).
A special case of the last constraints results by choosing \({\mathbf {x}}_{tot}^{ref} = {\mathbf {x}}_{tot}^{ap} \), giving the form \({\mathbf {E}}_{tot}^T ({\mathbf {x}}_{tot} - {\mathbf {x}}_{tot}^{ap} ) = {\mathbf {E}}_{tot}^T \delta {\mathbf {x}}_{tot} = \mathbf {0}\), minimizing the norm of the parameter corrections \(\delta {\mathbf {x}}_{tot}^T \delta {\mathbf {x}}_{tot} = \min \). They have exactly the same form as the inner constraints applied to the parameters, but they apply instead to the parameter corrections. Therefore one may replace the parameters in Eqs. (148) and (149) with their corrections. In general, it is hard to find justification for using a weight matrix other than the identity. In any case, it is advantageous to retain the separation of the constraints into those related to the initial coordinates, which define the reference system at the initial epoch, and those related to the velocities, which define the temporal evolution (rate) of the reference system. This is achieved by considering block-diagonal weight matrices of the form
$$\displaystyle \begin{aligned} {\mathbf{W}}_{tot} = \left[ \begin{array}{ccc} {{\mathbf{W}}_{{\mathbf{x}}_0 } } & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & {{\mathbf{W}}_{\mathbf{v}} } & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & {{\mathbf{W}}_{\mathbf{z}} } \end{array} \right], \end{aligned} $$
(150)
which leads to the separable generalized inner constraints
$$\displaystyle \begin{aligned} {\mathbf{E}}_{tot}^T {\mathbf{W}}_{tot} ({\mathbf{x}}_{tot} - {\mathbf{x}}_{tot}^{ref} ) = \left[ \begin{array}{ccc} {{\mathbf{E}}^T} & \mathbf{0} & {{\mathbf{J}}^T} \\ \mathbf{0} & {{\mathbf{E}}^T} & {{\mathbf{J}}_t^T } \end{array} \right]\left[ \begin{array}{ccc} {{\mathbf{W}}_{{\mathbf{x}}_0 } } & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & {{\mathbf{W}}_{\mathbf{v}} } & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & {{\mathbf{W}}_{\mathbf{z}} } \end{array} \right]\left[ \begin{array}{c} {{\mathbf{x}}_0 - {\mathbf{x}}_0^{ref} } \\ {\mathbf{v} - {\mathbf{v}}^{ref}} \\ {\mathbf{z} - {\mathbf{z}}^{ref}} \end{array} \right] = \mathbf{0}, \end{aligned} $$
(151)
or explicitly
$$\displaystyle \begin{aligned} &{\mathbf{E}}^T{\mathbf{W}}_{{\mathbf{x}}_0 } ({\mathbf{x}}_0 - {\mathbf{x}}_0^{ref}) + {\mathbf{J}}^T {\mathbf{W}}_{\mathbf{z}} (\mathbf{z} - {\mathbf{z}}^{ref}) = \mathbf{0}, \end{aligned} $$
(152)
$$\displaystyle \begin{aligned} &{\mathbf{E}}^T{\mathbf{W}}_{\mathbf{v}} (\mathbf{v} - {\mathbf{v}}^{ref}) + {\mathbf{J}}_t^T {\mathbf{W}}_{\mathbf{z}} (\mathbf{z} - {\mathbf{z}}^{ref}) = \mathbf{0}. \end{aligned} $$
(153)
A reasonable simplification is to use \({\mathbf{z}}^{ref} = \mathbf{0}\), in which case the initial epoch constraints \({\mathbf {E}}^T{\mathbf {W}}_{{\mathbf {x}}_0 } ({\mathbf {x}}_0 - {\mathbf {x}}_0^{ref} ) + {\mathbf {J}}^T{\mathbf {W}}_{\mathbf {z}} \mathbf {z} = \mathbf {0}\) can be used to adapt the initial epoch reference system to that of the target initial coordinates \({\mathbf {x}}_0^{ref} \), e.g., those of a previous solution. At the same time the rate constraints \({\mathbf {E}}^T{\mathbf {W}}_{\mathbf {v}} (\mathbf {v} - {\mathbf {v}}^{ref}) + {\mathbf {J}}_t^T {\mathbf {W}}_{\mathbf {z}} \mathbf {z} = \mathbf {0}\) can be used to adapt the temporal evolution of the reference system to that of the target velocities \({\mathbf {v}}^{ref}\), e.g., those from a previous solution, or velocities provided by a geophysical model.
The above objectives can be best met by partial constraints, especially those which do not involve the transformation parameters. The possible partial inner constraints are the following:
  1. (1)
    involving only initial coordinates and velocities
    $$\displaystyle \begin{aligned} \left[ \begin{array}{cc} {{\mathbf{E}}^T} & \mathbf{0} \\ \mathbf{0} & {{\mathbf{E}}^T} \end{array} \right]\left[ \begin{array}{c} {{\mathbf{x}}_0 } \\ \mathbf{v} \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{E}}^T{\mathbf{x}}_0 } \\ {{\mathbf{E}}^T\mathbf{v} } \end{array} \right] = \mathbf{0}:\qquad \qquad {\mathbf{E}}^T{\mathbf{x}}_0 = \mathbf{0}\qquad \mathrm{and}\qquad {\mathbf{E}}^T\mathbf{v} = \mathbf{0}, \end{aligned} $$
    (154)
  2. (2)
    involving only initial coordinates and transformation parameters
    $$\displaystyle \begin{aligned} \left[ \begin{array}{cc} {{\mathbf{E}}^T} & {{\mathbf{J}}^T} \\ \mathbf{0} & {{\mathbf{J}}_t^T } \end{array} \right]\left[ \begin{array}{c} {{\mathbf{x}}_0 } \\ \mathbf{z} \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{E}}^T{\mathbf{x}}_0 + {\mathbf{J}}^T\mathbf{z}} \\ {{\mathbf{J}}_t^T \mathbf{z}} \end{array} \right] = \mathbf{0}:\qquad {\mathbf{E}}^T{\mathbf{x}}_0 + {\mathbf{J}}^T\mathbf{z} = \mathbf{0}\qquad \mathrm{and}\qquad {\mathbf{J}}_t^T \mathbf{z} = \mathbf{0}, \end{aligned} $$
    (155)
  3. (3)
    involving only velocities and transformation parameters
    $$\displaystyle \begin{aligned} \left[ \begin{array}{cc} \mathbf{0} & {{\mathbf{J}}^T} \\ {{\mathbf{E}}^T} & {{\mathbf{J}}_t^T } \end{array} \right]\left[ \begin{array}{c} \mathbf{v} \\ \mathbf{z} \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{J}}^T\mathbf{z}} \\ {{\mathbf{E}}^T\mathbf{v} + {\mathbf{J}}_t^T \mathbf{z}} \end{array} \right] = \mathbf{0}:\qquad {\mathbf{J}}^T\mathbf{z} = \mathbf{0}\qquad \mathrm{and}\qquad {\mathbf{E}}^T\mathbf{v} + {\mathbf{J}}_t^T \mathbf{z} = \mathbf{0}, \end{aligned} $$
    (156)
  4. (4)
    involving only transformation parameters
    $$\displaystyle \begin{aligned} \left[ \begin{array}{c} {{\mathbf{J}}^T\mathbf{z}} \\ {{\mathbf{J}}_t^T \mathbf{z}} \end{array} \right] = \mathbf{0}:\qquad \qquad \qquad \qquad {\mathbf{J}}^T\mathbf{z} = \mathbf{0}\qquad \mathrm{and}\qquad {\mathbf{J}}_t^T \mathbf{z} = \mathbf{0}. \end{aligned} $$
    (157)
The explicit forms of the above partial inner constraints can be easily formed from the explicit form of the total inner constraints Eqs. (148) and (149) by simply removing the terms corresponding to the non-participating parameters. For example, the partial inner constraints involving only initial coordinates and velocities are
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} {\sum_{i = 1}^n {[{\mathbf{x}}_{0i}^{ap} \times ]} {\mathbf{x}}_{0i} } \\ {\sum_{i = 1}^n {{\mathbf{x}}_{0i} } } \\ {\sum_{i = 1}^n {({\mathbf{x}}_{0i}^{ap} )^T} {\mathbf{x}}_{0i} } \end{array} \right] = \mathbf{0},\quad \left[ \begin{array}{c} {\sum_{i = 1}^n {[{\mathbf{x}}_{0i}^{ap} \times ]{\mathbf{v}}_i } } \\ {\sum_{i = 1}^n {{\mathbf{v}}_i } } \\ {\sum_{i = 1}^n {({\mathbf{x}}_{0i}^{ap} )^T{\mathbf{v}}_i } } \end{array} \right] = \mathbf{0}, \end{aligned} $$
(158)
for initial epoch and rate, respectively.
In a similar way the partial inner constraints involving only transformation parameters are
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} {\sum_{k = 1}^m {\boldsymbol{\uptheta }_k } } \\ {\sum_{k = 1}^m {{\mathbf{d}}_k } } \\ {\sum_{k = 1}^m {s_k } } \end{array}\right] = \mathbf{0},\quad \left[ \begin{array}{c} {\sum_{k = 1}^m {(t_k - t_0 )\boldsymbol{\uptheta }_k } } \\ {\sum_{k = 1}^m {(t_k - t_0 ){\mathbf{d}}_k } } \\ {\sum_{k = 1}^m {(t_k - t_0 )s_k } } \end{array} \right] = \mathbf{0}, \end{aligned} $$
(159)
for initial epoch and rate, respectively.
Partial inner constraints involving only the initial coordinates, \({\mathbf{E}}^T{\mathbf{x}}_0 = \mathbf{0}\), or only the velocities, \({\mathbf{E}}^T\mathbf{v} = \mathbf{0}\), do not qualify as minimal constraints. Indeed, \({\mathbf{E}}^T{\mathbf{x}}_0 = \mathbf{0}\) defines the reference system at the initial epoch but not its temporal evolution, while \({\mathbf{E}}^T\mathbf{v} = \mathbf{0}\) defines the temporal evolution of the reference system but not the system at the initial epoch. Formally, the corresponding partial inner constraint matrices do not have full column rank in this case:
$$\displaystyle \begin{aligned} \mathrm{rank}({\mathbf{E}}_{par} ) = \mathrm{rank}\left[ \begin{array}{cc} \mathbf{E} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array} \right] = 7 < 14,\qquad \mathrm{rank}({\mathbf{E}}_{par} ) = \mathrm{rank}\left[ \begin{array}{cc} \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{E} \\ \mathbf{0} & \mathbf{0} \end{array} \right] = 7 < 14. \end{aligned} $$
(160)
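The rank statement (160) is easy to verify numerically, e.g. with the following sketch (Python/NumPy; a generic full-column-rank stand-in for E suffices for the rank argument, so arbitrary toy values are used):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 4
E = rng.normal(size=(3 * n, 7))        # stands in for the 3n x 7 matrix E
Z = np.zeros((3 * n, 7))
Zz = np.zeros((7 * m, 7))

# E_par of Eq. (160), constraining the initial coordinates only
E_par = np.block([[E, Z],
                  [Z, Z],
                  [Zz, Zz]])
print(np.linalg.matrix_rank(E_par))    # 7 < 14: not a set of minimal constraints
```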
The possible generalized partial constraints for block diagonal weight matrix are the following:
  1. (1)
    involving only initial coordinates and velocities
    $$\displaystyle \begin{aligned} {\mathbf{E}}^T{\mathbf{W}}_{{\mathbf{x}}_0 } ({\mathbf{x}}_0 - {\mathbf{x}}_0^{ref} ) = \mathbf{0}\qquad \mathrm{and}\qquad {\mathbf{E}}^T{\mathbf{W}}_{\mathbf{v}} (\mathbf{v} - {\mathbf{v}}^{ref}) = \mathbf{0}, \end{aligned} $$
    (161)
  2. (2)
    involving only initial coordinates and transformation parameters
    $$\displaystyle \begin{aligned} {\mathbf{E}}^T{\mathbf{W}}_{{\mathbf{x}}_0 } ({\mathbf{x}}_0 - {\mathbf{x}}_0^{ref} ) + {\mathbf{J}}^T{\mathbf{W}}_{\mathbf{z}} (\mathbf{z} - {\mathbf{z}}^{ref}) = \mathbf{0}\qquad \mathrm{and}\qquad {\mathbf{J}}_t^T {\mathbf{W}}_{\mathbf{z}} (\mathbf{z} - {\mathbf{z}}^{ref}) = \mathbf{0}, \end{aligned} $$
    (162)
  3. (3)
    involving only velocities and transformation parameters
    $$\displaystyle \begin{aligned} {\mathbf{J}}^T{\mathbf{W}}_{\mathbf{z}} (\mathbf{z} - {\mathbf{z}}^{ref}) = \mathbf{0}\qquad \mathrm{and}\qquad {\mathbf{E}}^T{\mathbf{W}}_{\mathbf{v}} (\mathbf{v} - {\mathbf{v}}^{ref}) + {\mathbf{J}}^T_t{\mathbf{W}}_{\mathbf{z}} (\mathbf{z} - {\mathbf{z}}^{ref}) = \mathbf{0}, \end{aligned} $$
    (163)
  4. (4)
    involving only transformation parameters
    $$\displaystyle \begin{aligned} {\mathbf{J}}^T{\mathbf{W}}_{\mathbf{z}} (\mathbf{z} - {\mathbf{z}}^{ref}) = \mathbf{0}\qquad \mathrm{and}\qquad {\mathbf{J}}_t^T {\mathbf{W}}_{\mathbf{z}} (\mathbf{z} - {\mathbf{z}}^{ref}) = \mathbf{0}. \end{aligned} $$
    (164)
If the weight submatrices \({\mathbf {W}}_{{\mathbf {x}}_0 } \), \({\mathbf {W}}_{\mathbf {v}}\) and \({\mathbf {W}}_{\mathbf {z}}\) are also block-diagonal, we may derive explicit forms of the above generalized constraints. For example, those involving only initial coordinates and velocities become
$$\displaystyle \begin{aligned} \sum_{i = 1}^n {\left[ \begin{array}{c} { - [{\mathbf{x}}_{0i}^{ap} \times ]{\mathbf{W}}_{{\mathbf{x}}_{0i} } ({\mathbf{x}}_{0i} - {\mathbf{x}}_{0i}^{ref} )} \\ {{\mathbf{W}}_{{\mathbf{x}}_{0i} } ({\mathbf{x}}_{0i} - {\mathbf{x}}_{0i}^{ref} )} \\ {({\mathbf{x}}_{0i}^{ap} )^T{\mathbf{W}}_{{\mathbf{x}}_{0i} } ({\mathbf{x}}_{0i} - {\mathbf{x}}_{0i}^{ref} )} \end{array} \right]} &= \mathbf{0},\\ \sum_{i = 1}^n {\left[ \begin{array}{c} { - [{\mathbf{x}}_{0i}^{ap} \times ]{\mathbf{W}}_{{\mathbf{v}}_i } ({\mathbf{v}}_i - {\mathbf{v}}_i^{ref} )} \\ {{\mathbf{W}}_{{\mathbf{v}}_i } ({\mathbf{v}}_i - {\mathbf{v}}_i^{ref} )} \\ {({\mathbf{x}}_{0i}^{ap} )^T{\mathbf{W}}_{{\mathbf{v}}_i } ({\mathbf{v}}_i - {\mathbf{v}}_i^{ref} )} \end{array} \right]} &= \mathbf{0}, \end{aligned} $$
(165)
while those involving only transformation parameters become
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} {\sum_{k = 1}^m {{\mathbf{W}}_{\boldsymbol{\uptheta }_k } (\boldsymbol{\uptheta }_k - \boldsymbol{\uptheta }_k^{ref} )} } \\ {\sum_{k = 1}^m {{\mathbf{W}}_{{\mathbf{d}}_k } ({\mathbf{d}}_k - {\mathbf{d}}_k^{ref} )} } \\ {\sum_{k = 1}^m {w_{s_k } (s_k - s_k^{ref} )} } \end{array} \right] = \mathbf{0},\quad \left[ \begin{array}{c} {\sum_{k = 1}^m {(t_k - t_0 ){\mathbf{W}}_{\boldsymbol{\uptheta }_k } (\boldsymbol{\uptheta }_k - \boldsymbol{\uptheta }_k^{ref} )} } \\ {\sum_{k = 1}^m {(t_k - t_0 ){\mathbf{W}}_{{\mathbf{d}}_k } ({\mathbf{d}}_k - {\mathbf{d}}_k^{ref} )} } \\ {\sum_{k = 1}^m {(t_k - t_0 )w_{s_k } (s_k - s_k^{ref} )} } \end{array} \right] = \mathbf{0}. \end{aligned} $$
(166)

9 A Posteriori Change of the Spatiotemporal Reference System

Instead of directly applying a desired set of minimal constraints \({\mathbf {C}}^T\mathbf {x} = \mathbf {d}\) in order to obtain the corresponding least squares solution \(\hat {\mathbf {x}}_C \), one can first obtain any least squares solution \(\hat {\mathbf {x}}\) by means of any convenient set of minimal constraints and then transform it into the desired solution \(\hat {\mathbf {x}}_C \), using the solution conversion equation \(\hat {\mathbf {x}}_C = \hat {\mathbf {x}} - \mathbf {E}({\mathbf {C}}^T\mathbf {E})^{ - 1}({\mathbf {C}}^T\hat {\mathbf {x}} - \mathbf {d})\). We will apply this equation first to the various types of (total) inner constraints, both the usual and the generalized ones, and then to the various types of partial inner constraints involving only a subset of the unknown parameters. Finally, we look into the possibility of applying quite different types of constraints to the initial coordinates and to the velocities, an option that becomes possible in view of the separation of the linearized reference system transformation into two independent ones, for initial coordinates and for velocities. An important result in this context is the possibility of applying the minimum norm property, or the minimum trace of the covariance matrix property, to only a subset of the above parameters.
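The conversion equation amounts to a single small linear solve. A minimal sketch (in Python with NumPy; the function name convert_solution is ours and merely illustrative), assuming the stacked estimate, the matrix E and the chosen minimal constraints (C, d) are available as arrays:

```python
import numpy as np

def convert_solution(x_hat, E, C, d):
    """Convert a least squares solution x_hat into the one satisfying
    C^T x = d:  x_C = x_hat - E (C^T E)^{-1} (C^T x_hat - d)."""
    t = np.linalg.solve(C.T @ E, C.T @ x_hat - d)   # t = (C^T E)^{-1}(C^T x_hat - d)
    return x_hat - E @ t
```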

9.1 Conversion to a Solution Satisfying Some Type of Inner Constraints

We will now examine how a solution obtained using any type of minimal constraints can be converted to a solution satisfying either generalized inner constraints or one of their special cases, such as the usual inner constraints. We refer here to total constraints involving all of the three sets of parameters, initial station coordinates, station velocities, and nuisance transformation parameters.

In the case of generalized inner constraints (weighted minimum distance constraints) \({\mathbf {E}}_{tot}^T \mathbf {W}({\mathbf {x}}_{tot} - {\mathbf {x}}^{ref}_{tot} ) = \mathbf {0}\), it holds that \({\mathbf {C}}^T = {\mathbf {E}}_{tot}^T \mathbf {W}\), \(\mathbf {d} = {\mathbf {E}}_{tot}^T \mathbf {Wx}^{ref}_{tot} \) and the general conversion equation \(\hat {\mathbf {x}}_C = \hat {\mathbf {x}} - \mathbf {E}({\mathbf {C}}^T\mathbf {E})^{ - 1}({\mathbf {C}}^T\hat {\mathbf {x}} - \mathbf {d})\) becomes
$$\displaystyle \begin{aligned} {\hat{\mathbf{x}}^{\prime}}_{tot} = \hat{\mathbf{x}}_{tot} - {\mathbf{E}}_{tot} ({\mathbf{E}}_{tot}^T \mathbf{WE}_{tot} )^{ - 1}{\mathbf{E}}_{tot}^T \mathbf{W}(\hat{\mathbf{x}}_{tot} - {\mathbf{x}}^{ref}_{tot} ). \end{aligned} $$
(167)
Application to the explicit set of the stacking parameters and restriction to block-diagonal weight matrix gives
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} {\hat{\mathbf{{x}^{\prime}}}_0 } \\ {\hat{\mathbf{v}}^{\prime}} \\ {\hat{\mathbf{z}}^{\prime}} \end{array} \right] & = \left[ \begin{array}{c} {\hat{\mathbf{x}}_0 } \\ \hat{\mathbf{v}} \\ \hat{\mathbf{z}} \end{array} \right] - \left[ \begin{array}{cc} \mathbf{E} & \mathbf{0} \\ \mathbf{0} & \mathbf{E} \\ \mathbf{J} & {{\mathbf{J}}_t } \end{array} \right]\left[ \begin{array}{cc} {{\mathbf{E}}^T{\mathbf{W}}_{{\mathbf{x}}_0 } \mathbf{E} + {\mathbf{J}}^T{\mathbf{W}}_{\mathbf{z}} \mathbf{J}} & {{\mathbf{J}}^T{\mathbf{W}}_{\mathbf{z}} {\mathbf{J}}_t } \\ {{\mathbf{J}}_t^T {\mathbf{W}}_{\mathbf{z}} \mathbf{J}} & {{\mathbf{E}}^T{\mathbf{W}}_{\mathbf{v}} \mathbf{E} + {\mathbf{J}}_t^T {\mathbf{W}}_{\mathbf{z}} {\mathbf{J}}_t } \end{array} \right]^{ - 1}\times\\ &\qquad \qquad \times\left[ \begin{array}{cc} {{\mathbf{E}}^T{\mathbf{W}}_{{\mathbf{x}}_0 } (\hat{\mathbf{x}}_0 - {\mathbf{x}}_0^{ref} ) + {\mathbf{J}}^T{\mathbf{W}}_{\mathbf{z}} (\hat{\mathbf{z}} - {\mathbf{z}}^{ref})} \\ {{\mathbf{E}}^T{\mathbf{W}}_{\mathbf{v}} (\hat{\mathbf{v}} - {\mathbf{v}}^{ref}) + {\mathbf{J}}_t^T {\mathbf{W}}_{\mathbf{z}} (\hat{\mathbf{z}} - {\mathbf{z}}^{ref})} \end{array} \right] \equiv\\ & \equiv \left[ \begin{array}{c} {\hat{\mathbf{x}}_0 } \\ \hat{\mathbf{v}} \\ \hat{\mathbf{z}} \end{array} \right] - \left[ \begin{array}{cc} \mathbf{E} & \mathbf{0} \\ \mathbf{0} & \mathbf{E} \\ \mathbf{J} & {{\mathbf{J}}_t } \end{array} \right]\left[ \begin{array}{c} {{\mathbf{t}}_{\mathbf{x}} } \\ {{\mathbf{t}}_{\mathbf{v}} } \end{array} \right]. \end{aligned} $$
(168)
Setting \({\mathbf {t}}_{\mathbf {x}} = \left [ {{\mathbf {t}}_{\mathbf {x},\boldsymbol {\uptheta }}^T \,{\mathbf {t}}_{\mathbf {x},\mathbf {d}}^T \,{t}_{\mathbf {x},s} } \right ]^T\), \({\mathbf {t}}_{\mathbf {v}} = \left [ {{\mathbf {t}}_{\mathbf {v},\boldsymbol {\uptheta }}^T \,{\mathbf {t}}_{\mathbf {v},\mathbf {d}}^T \,{t}_{\mathbf {v},s} } \right ]^T\) the conversion equations take the analytic form
$$\displaystyle \begin{aligned} {\hat{\mathbf{x}}^{\prime}}_{0i} & = \hat{\mathbf{x}}_{0i} - \left[ {{\mathbf{x}}_{0i}^{ap} \times } \right]{\mathbf{t}}_{\mathbf{x},\boldsymbol{\uptheta }} - {\mathbf{t}}_{\mathbf{x},\mathbf{d}} - {t}_{\mathbf{x},s} {\mathbf{x}}_{0i}^{ap} , \\ {\hat{\mathbf{v}}^{\prime}}_i & = \hat{\mathbf{v}}_i - \left[ {{\mathbf{x}}_{0i}^{ap} \times } \right]{\mathbf{t}}_{\mathbf{v},\boldsymbol{\uptheta }} - {\mathbf{t}}_{\mathbf{v},\mathbf{d}} - {t}_{\mathbf{v},s} {\mathbf{x}}_{0i}^{ap} , \\ {\hat{\boldsymbol{\theta }}^{\prime}}_k & = \hat{\boldsymbol{\theta }}_k + {\mathbf{t}}_{\mathbf{x},\boldsymbol{\uptheta }} + (t_k - t_0 ){\mathbf{t}}_{\mathbf{v},\boldsymbol{\uptheta }} , \\ {\hat{\mathbf{d}}^{\prime}}_k & = \hat{\mathbf{d}}_k + {\mathbf{t}}_{\mathbf{x},\mathbf{d}} + (t_k - t_0 ){\mathbf{t}}_{\mathbf{v},\mathbf{d}} , \\ {\hat{s}}^{\prime}_k & = \hat{s}_k + t_{\mathbf{x},s} + (t_k - t_0)t_{\mathbf{v},s}. {} \end{aligned} $$
(169)
The above equations are general enough to cover all types of constraints. What varies from case to case is the form of the auxiliary vectors \({\mathbf {t}}_{\mathbf {x}} \) and \({\mathbf {t}}_{\mathbf {v}} \). We give below the possible special cases.

Conversion to the solution for minimum distance constraints (special case \(\mathbf{W} = \mathbf{I}\)):

The minimal constraints have in this case the form \({\mathbf {E}}_{tot}^T ({\mathbf {x}}_{tot} - {\mathbf {x}}^{ref}_{tot} ) = \mathbf {0}\) and the auxiliary conversion vectors become
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} {{\mathbf{t}}_{\mathbf{x}} } \\ {{\mathbf{t}}_{\mathbf{v}} } \end{array} \right] = \left[ \begin{array}{cc} {{\mathbf{E}}^T\mathbf{E} + m{\mathbf{I}}_7 } & {\tau_1 {\mathbf{I}}_7 } \\ {\tau_1 {\mathbf{I}}_7 } & {{\mathbf{E}}^T\mathbf{E} + \tau_2 {\mathbf{I}}_7 } \end{array} \right]^{ - 1}\left[ \begin{array}{c} {{\mathbf{E}}^T(\hat{\mathbf{x}}_0 - {\mathbf{x}}_0^{ref} ) + {\mathbf{J}}^T(\hat{\mathbf{z}} - {\mathbf{z}}^{ref})} \\ {{\mathbf{E}}^T(\hat{\mathbf{v}} - {\mathbf{v}}^{ref}) + {\mathbf{J}}_t^T (\hat{\mathbf{z}} - {\mathbf{z}}^{ref})} \end{array} \right], \end{aligned} $$
(170)
where
$$\displaystyle \begin{aligned} \tau_1 = \sum_{k = 1}^m {(t_k - t_0 )} ,\qquad \quad \tau_2 = \sum_{k = 1}^m {(t_k - t_0 )^2} . \end{aligned} $$
(171)
Conversion to the solution for inner constraints (special case \(\mathbf{W} = \mathbf{I}\), \({\mathbf {x}}^{ref}_{tot} = \mathbf {0})\):
The minimal constraints have in this case the form \({\mathbf {E}}_{tot}^T \,{\mathbf {x}}_{tot} = \mathbf {0}\) and the auxiliary conversion vectors become
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} {{\mathbf{t}}_{\mathbf{x}} } \\ {{\mathbf{t}}_{\mathbf{v}} } \end{array} \right] = \left[ \begin{array}{cc} {{\mathbf{E}}^T\mathbf{E} + m{\mathbf{I}}_7 } & {\tau_1 {\mathbf{I}}_7 } \\ {\tau_1 {\mathbf{I}}_7 } & {{\mathbf{E}}^T\mathbf{E} + \tau_2 {\mathbf{I}}_7 } \end{array} \right]^{ - 1}\left[ \begin{array}{c} {{\mathbf{E}}^T\hat{\mathbf{x}}_0 + {\mathbf{J}}^T\hat{\mathbf{z}}} \\ {{\mathbf{E}}^T\hat{\mathbf{v}} + {\mathbf{J}}_t^T \hat{\mathbf{z}}} \end{array} \right]. \end{aligned} $$
(172)
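Computationally, Eq. (172) is a single 14 × 14 solve. A minimal sketch (Python/NumPy; inner_constraint_shifts is our own illustrative name, and J is assumed to stack one 7 × 7 block per epoch, so that \({\mathbf{J}}^T\mathbf{J} = m{\mathbf{I}}_7\)):

```python
import numpy as np

def inner_constraint_shifts(E, J, Jt, xhat0, vhat, zhat, tau1, tau2):
    """t_x, t_v of Eq. (172) for the usual inner constraints (W = I)."""
    m = J.shape[0] // 7                     # number of epochs
    N = np.block([[E.T @ E + m * np.eye(7), tau1 * np.eye(7)],
                  [tau1 * np.eye(7), E.T @ E + tau2 * np.eye(7)]])
    rhs = np.concatenate([E.T @ xhat0 + J.T @ zhat,
                          E.T @ vhat + Jt.T @ zhat])
    t = np.linalg.solve(N, rhs)
    return t[:7], t[7:]                     # t_x and t_v of Eq. (169)
```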

9.2 Conversion to a Solution Satisfying Some Type of Partial Inner Constraints

We look next into the conversion to a solution satisfying generalized partial inner constraints of the form \({\mathbf {E}}_{par}^T \mathbf {W}({\mathbf {x}}_{tot} - {\mathbf {x}}^{ref}_{tot} ) = \mathbf {0}\), where only a subset of the unknowns is constrained, and their special cases. In this case, with \({\mathbf {C}}^T = {\mathbf {E}}_{par}^T \mathbf {W}\) and \(\mathbf {d} = {\mathbf {E}}_{par}^T \mathbf {Wx}^{ref}_{tot} \), the general conversion relation \({\hat {\mathbf {x}}^{\prime }_{tot}} = \hat {\mathbf {x}}_{tot} - {\mathbf {E}}_{tot}({\mathbf {C}}^T\,{\mathbf {E}}_{tot} )^{ - 1}({\mathbf {C}}^T{\hat {\mathbf {x}}_{tot}} - \mathbf {d})\) takes the form
$$\displaystyle \begin{aligned} {\hat{\mathbf{x}}^{\prime}}_{tot} = \hat{\mathbf{x}}_{tot} - {\mathbf{E}}_{tot}({\mathbf{E}}_{par}^T \mathbf{WE}_{tot} )^{ - 1}{\mathbf{E}}_{par}^T \mathbf{W}(\hat{\mathbf{x}}_{tot} - {\mathbf{x}}^{ref}_{tot} ) = \hat{\mathbf{x}}_{tot} - {\mathbf{E}}_{tot}\mathbf{t}. \end{aligned} $$
(173)
Application to the explicit set of the stacking parameters and restriction to block-diagonal weight matrix gives again the general conversion Eqs. (169) with
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} {{\mathbf{t}}_{\mathbf{x}} } \\ {{\mathbf{t}}_{\mathbf{v}} } \end{array} \right] = \left( {{\mathbf{E}}_{par}^T \left[ \begin{array}{c@{\,\,\,\,\,\,}c} {{\mathbf{W}}_{{\mathbf{x}}_0 } \mathbf{E}} & \mathbf{0} \\ \mathbf{0} & {{\mathbf{W}}_{\mathbf{v}} \mathbf{E}} \\ {{\mathbf{W}}_{\mathbf{z}} \mathbf{J}} & {{\mathbf{W}}_{\mathbf{z}} {\mathbf{J}}_t } \end{array} \right]} \right)^{ - 1}{{\mathbf{E}}_{par}^T \left[ \begin{array}{c} {{\mathbf{W}}_{{\mathbf{x}}_0 } (\hat{\mathbf{x}}_0 - {\mathbf{x}}_0^{ref} )} \\ {{\mathbf{W}}_{\mathbf{v}} (\hat{\mathbf{v}} - {\mathbf{v}}^{ref})} \\ {{\mathbf{W}}_{\mathbf{z}} (\hat{\mathbf{z}} - {\mathbf{z}}^{ref})} \end{array} \right]}. \end{aligned} $$
(174)
To get the conversion equation for each particular type of partial inner constraints we merely need to replace the matrix \({\mathbf{E}}_{par}\) with its particular form. We will present only two cases, the one involving only initial coordinates and velocities and the one involving only transformation parameters.

Generalized partial inner constraints involving only initial coordinates and velocities:

In this case
$$\displaystyle \begin{aligned} {\mathbf{E}}_{par} = \left[ \begin{array}{cc} \mathbf{E} & \mathbf{0} \\ \mathbf{0} & \mathbf{E} \\ \mathbf{0} & \mathbf{0} \end{array} \right], \end{aligned} $$
(175)
and one needs to apply Eqs. (169) with
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} {{\mathbf{t}}_{\mathbf{x}} } \\ {{\mathbf{t}}_{\mathbf{v}} } \end{array} \right] = \left[ \begin{array}{c} {({\mathbf{E}}^T{\mathbf{W}}_{{\mathbf{x}}_0 } \mathbf{E})^{ - 1}{\mathbf{E}}^T{\mathbf{W}}_{{\mathbf{x}}_0 } (\hat{\mathbf{x}}_0 - {\mathbf{x}}_0^{ref} )} \\ {({\mathbf{E}}^T{\mathbf{W}}_{\mathbf{v}} \mathbf{E})^{ - 1}{\mathbf{E}}^T{\mathbf{W}}_{\mathbf{v}} (\hat{\mathbf{v}} - {\mathbf{v}}^{ref})} \end{array} \right]. \end{aligned} $$
(176)
For the special case \({\mathbf {W}}_{{\mathbf {x}}_0 } = \mathbf {I}\), \({\mathbf {W}}_{\mathbf {v}} = \mathbf {I}\), it holds that \({\mathbf {t}}_{\mathbf {x}} = ({\mathbf {E}}^T\mathbf {E})^{ - 1}{\mathbf {E}}^T(\hat {\mathbf {x}}_0 - {\mathbf {x}}_0^{ref} )\), \({\mathbf {t}}_{\mathbf {v}} = ({\mathbf {E}}^T\mathbf {E})^{ - 1}{\mathbf {E}}^T(\hat {\mathbf {v}} - {\mathbf {v}}^{ref})\) and
$$\displaystyle \begin{aligned} ({\mathbf{E}}^T\mathbf{E}) = \left[ \begin{array}{ccc} {\mathbf{C} - n\left[ {\bar{\mathbf{x}}_0^{ap} \times } \right]^2} & { - n\left[ {\bar{\mathbf{x}}_0^{ap} \times } \right]} & \mathbf{0} \\ {n\left[ {\bar{\mathbf{x}}_0^{ap} \times } \right]} & {n{\mathbf{I}}_3 } & {n\bar{\mathbf{x}}_0^{ap} } \\ \mathbf{0} & {n(\bar{\mathbf{x}}_0^{ap} )^T} & {\gamma^2 + n(\bar{\mathbf{x}}_0^{ap} )^T\bar{\mathbf{x}}_0^{ap} } \end{array} \right], \end{aligned} $$
(177)
where we have set
$$\displaystyle \begin{aligned} &{} \bar{\mathbf{x}}_0^{ap} = \frac{1}{n}\sum_{i = 1}^n {{\mathbf{x}}_{0\,i}^{ap}} ,\qquad \Delta {\mathbf{x}}_{0\,i}^{ap} = {\mathbf{x}}_{0\,i}^{ap} - \bar{\mathbf{x}}_0^{ap}, \end{aligned} $$
(178)
$$\displaystyle \begin{aligned} &{} \mathbf{C} = - \sum_{i = 1}^n {[\Delta {\mathbf{x}}_{0\,i}^{ap} \times ]^2,} \end{aligned} $$
(179)
$$\displaystyle \begin{aligned} &{} \gamma^2 = \sum_{i = 1}^n {(\Delta {\mathbf{x}}_{0\,i}^{ap} )^T\Delta {\mathbf{x}}_{0\,i}^{ap}}. \end{aligned} $$
(180)
Implementation of the analytical inverse
$$\displaystyle \begin{aligned} ({\mathbf{E}}^T\mathbf{E})^{ - 1} = \left[ \begin{array}{ccc} {\mathbf{C}}^{-1} & {{\mathbf{C}}^{-1}\left[ {\bar{\mathbf{x}}_0^{ap} \times } \right]} & \mathbf{0} \\ {} - \left[ {\bar{\mathbf{x}}_0^{ap} \times } \right]{\mathbf{C}}^{ - 1} & {\frac{1}{n}{\mathbf{I}}_3 + \frac{1}{\gamma^2}\bar{\mathbf{x}}_0^{ap} (\bar{\mathbf{x}}_0^{ap} )^T - [\bar{\mathbf{x}}_0^{ap} \times ]{\mathbf{C}}^{ - 1}[\bar{\mathbf{x}}_0^{ap} \times ]} & { - \frac{1}{\gamma^2}\bar{\mathbf{x}}_0^{ap} } \\ {} \mathbf{0} & { - \frac{1}{\gamma^2}(\bar{\mathbf{x}}_0^{ap} )^T} & {\frac{1}{\gamma^2}} \end{array} \right], \end{aligned} $$
(181)
leads to the following conversion equations to a solution satisfying simultaneously the constraints \({\mathbf{E}}^T{\mathbf{W}}_{{\mathbf {x}}_0 } (\hat {\mathbf {x}}_0 - {\mathbf {x}}_0^{ref} ) = \mathbf {0}\) and \({\mathbf{E}}^T{\mathbf {W}}_{\mathbf {v}} (\hat {\mathbf {v}} - {\mathbf {v}}^{ref}) = \mathbf {0}\)
$$\displaystyle \begin{aligned} {\hat{\mathbf{x}}^{\prime}}_{0i} & = {\hat{\mathbf{x}}}_{0i} - \frac{1}{n}\sum_{j = 1}^n (\hat{\mathbf{x}}_{0j}-{\mathbf{x}}^{ref}_{0j}){ + \left[ {\Delta {\mathbf{x}}_{0i}^{ap} \times } \right]} {\mathbf{C}}^{ - 1}\sum_{j = 1}^n {\left[ {\Delta {\mathbf{x}}_{0j}^{ap} \times } \right]} \,(\hat{\mathbf{x}}_{0j}-{\mathbf{x}}^{ref}_{0j})-\\ &\quad - \frac{1}{\gamma^2}\left[ {\sum_{j = 1}^n {(\Delta {\mathbf{x}}_{0j}^{ap} )^T} (\hat{\mathbf{x}}_{0j}-{\mathbf{x}}^{ref}_{0j})} \right]\Delta {\mathbf{x}}_{0i}^{ap} , \\ {\hat{\mathbf{v}}^{\prime}}_i & = {\hat{\mathbf{v}}}_i-\frac{1}{n}\sum_{j=1}^{n}(\hat{\mathbf{v}}_j-{\mathbf{v}}_{j}^{ref}) + \left[ {\Delta {\mathbf{x}}_{0i}^{ap} \times } \right]{\mathbf{C}}^{ - 1}\sum_{j = 1}^n {\left[ {\Delta {\mathbf{x}}_{0j}^{ap} \times } \right]} (\hat{\mathbf{v}}_j -{\mathbf{v}}_j^{ref})- \\ &\quad - \frac{1}{\gamma^2}\left[ \sum_{j = 1}^n ( {\Delta {\mathbf{x}}_{0j}^{ap}} )^T(\hat{\mathbf{v}}_j -{\mathbf{v}}_j^{ref}) \right]\Delta {\mathbf{x}}_{0i}^{ap} , \\ {\hat{\boldsymbol{\uptheta }}^{\prime}}_k & = \hat{\boldsymbol{\uptheta }}_k + {\mathbf{C}}^{ - 1}\sum_{j = 1}^n [ {\Delta {\mathbf{x}}_{0j}^{ap} \times } ]\left[\hat{\mathbf{x}}_{0j} - {\mathbf{x}}_{0j}^{ref}+(t_k - t_0 )(\hat{\mathbf{v}}_j -{\mathbf{v}}_j^{ref})\right], \\ {\hat{s}}^{\prime}_k & = \hat{s}_k - \frac{1}{\gamma^2}\sum_{j = 1}^n \left( {\Delta {\mathbf{x}}_{0j}^{ap}} \right)^T\left[\hat{\mathbf{x}}_{0j} -{\mathbf{x}}_{0j}^{ref}+ (t_k - t_0 )(\hat{\mathbf{v}}_j-{\mathbf{v}}_{j}^{ref}) \right], \\ {\hat{\mathbf{d}}^{\prime}}_k & = \hat{\mathbf{d}}_k - \frac{1}{n}\sum_{j = 1}^n {\left[\hat{\mathbf{x}}_{0j} -{\mathbf{x}}_{0j}^{ref}+ (t_k - t_0 )(\hat{\mathbf{v}}_j -{\mathbf{v}}_{j}^{ref})\right]-}\\ &\quad -\left[ {\bar{\mathbf{x}}_0^{ap} \times } \right]({\hat{\boldsymbol{\uptheta }}^{\prime}}_k - \hat{\boldsymbol{\uptheta }}_k ) - ({\hat{s}}^{\prime}_k - \hat{s}_k )\bar{\mathbf{x}}_0^{ap} .{}\end{aligned} $$
(182)
Generalized partial inner constraints involving only transformation parameters:
In this case
$$\displaystyle \begin{aligned} {\mathbf{E}}_{par} = \left[ \begin{array}{cc} \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \\ \mathbf{J} & {{\mathbf{J}}_t } \end{array} \right],\end{aligned} $$
(183)
and one may apply Eqs. (169) with
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} {\mathbf{t}}_{\mathbf{x}} \\ {\mathbf{t}}_{\mathbf{v}} \end{array} \right] = \left[ \begin{array}{c@{\,\,\,\,\,\,}c} {\mathbf{J}}^T{\mathbf{W}}_{\mathbf{z}} \mathbf{J} & {\mathbf{J}}^T{\mathbf{W}}_{\mathbf{z}} {\mathbf{J}}_t \\ {\mathbf{J}}^T_t{\mathbf{W}}_{\mathbf{z}} \mathbf{J} & {\mathbf{J}}^T_t{\mathbf{W}}_{\mathbf{z}} {\mathbf{J}}_t \end{array} \right]^{-1}\left[ \begin{array}{c} {\mathbf{J}}^T{\mathbf{W}}_{\mathbf{z}} (\hat{\mathbf{z}} - {\mathbf{z}}^{ref}) \\ {\mathbf{J}}_t^T {\mathbf{W}}_{\mathbf{z}} (\hat{\mathbf{z}} - {\mathbf{z}}^{ref}) \end{array} \right]. \end{aligned} $$
(184)
In the special case that \({\mathbf{W}}_{\mathbf{z}} = \mathbf{I}\) we may analytically invert
$$\displaystyle \begin{aligned} \left[ \begin{array}{c@{\,\,\,\,}c} {\mathbf{J}}^T\mathbf{J} & {\mathbf{J}}^T{\mathbf{J}}_t \\ {\mathbf{J}}_t^T \mathbf{J} & {\mathbf{J}}_t^T {\mathbf{J}}_t \end{array} \right]^{ - 1} = \left[ \begin{array}{cc} m{\mathbf{I}}_7 & \tau_1 {\mathbf{I}}_7 \\ \tau_1 {\mathbf{I}}_7 & \tau_2 {\mathbf{I}}_7 \end{array} \right]^{ - 1} = \frac{1}{m\tau_2 - \tau_1^2 }\left[ \begin{array}{cc} \tau_2 {\mathbf{I}}_7 & - \tau_1 {\mathbf{I}}_7 \\ - \tau_1 {\mathbf{I}}_7 & m{\mathbf{I}}_7 \end{array} \right], \end{aligned} $$
(185)
and obtain
$$\displaystyle \begin{aligned} {\mathbf{t}}_{x} & = \frac{\tau_2 }{m\tau_2 - \tau_1^2 }\sum_k {(\hat{\mathbf{z}}_k - {\mathbf{z}}_k^{ref} ) - \frac{\tau_1 }{m\tau_2 - \tau_1^2 }} \sum_k {(t_k - t_0 )(\hat{\mathbf{z}}_k - {\mathbf{z}}_k^{ref} ),} \end{aligned} $$
(186)
$$\displaystyle \begin{aligned} {\mathbf{t}}_{{v}} & = -\frac{\tau_1 }{m\tau_2 - \tau_1^2 }\sum_k {(\hat{\mathbf{z}}_k - {\mathbf{z}}_k^{ref} ) + \frac{m}{m\tau_2 - \tau_1^2 }} \sum_k {(t_k - t_0 )(\hat{\mathbf{z}}_k - {\mathbf{z}}_k^{ref} ).} \end{aligned} $$
(187)
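Equations (186) and (187) are simple weighted sums over the epochs; a compact sketch (Python/NumPy; the name t_shifts_z_only is ours):

```python
import numpy as np

def t_shifts_z_only(z_hat, z_ref, dt):
    """t_x, t_v of Eqs. (186)-(187) for W_z = I.
    z_hat, z_ref: (m, 7) per-epoch transformation parameters;
    dt: (m,) array of epoch offsets t_k - t_0."""
    dz = z_hat - z_ref
    m = dz.shape[0]
    tau1, tau2 = dt.sum(), (dt ** 2).sum()
    s0 = dz.sum(axis=0)                    # sum_k (z_hat_k - z_ref_k)
    s1 = (dt[:, None] * dz).sum(axis=0)    # sum_k (t_k - t_0)(z_hat_k - z_ref_k)
    det = m * tau2 - tau1 ** 2
    return (tau2 * s0 - tau1 * s1) / det, (-tau1 * s0 + m * s1) / det
```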
It is also possible to derive similar relations for generalized partial inner constraints involving only initial coordinates and transformation parameters, or only velocities and transformation parameters. It is not possible, though, to apply generalized partial inner constraints involving only initial coordinates or only velocities, as already explained.

9.3 Conversion to a Solution Satisfying Different Constraints for Initial Coordinates and Velocities

An important characteristic of the linearized parameter transformation is the splitting (Eq. 147) between the initial coordinates transformation \({\mathbf{x}}^{\prime}_0 = {\mathbf{x}}_0 + \mathbf{Ep}_0\) and the velocity transformation \(\mathbf {{v}^{\prime }} = \mathbf {v} + \mathbf {E}\dot {\mathbf {p}}\). This allows us to seek solutions which incorporate different principles for selecting the least squares estimates \(\hat {\mathbf {x}}_0 \) and \(\hat {\mathbf {v}}\), beyond the above cases based on the choice of a set of “global” minimal constraints \({\mathbf {C}}_{tot}^{T} {\mathbf {x}}_{tot} = {\mathbf {d}}_{tot}\) involving both \({\mathbf{x}}_0\) and \(\mathbf{v}\). Before examining particular choices, we consider the general case where the constraints matrix has the form
$$\displaystyle \begin{aligned} {\mathbf{C}}_{tot} = \left[ \begin{array}{cc} {\mathbf{C}}_x & \mathbf{0} \\ \mathbf{0} & {\mathbf{C}}_v \\ \mathbf{0} & \mathbf{0} \end{array} \right], \end{aligned} $$
(188)
and leads to separate constraints \({\mathbf {C}}_x^T {\mathbf {x}}_0 = {\mathbf {d}}_x \) and \({\mathbf {C}}_v^T \mathbf {v} = {\mathbf {d}}_v\), which can be thus of a different nature. The total conversion \({\hat {\mathbf {x}}^{\prime }}_{tot} = \hat {\mathbf {x}}_{tot} - {\mathbf {E}}_{tot} ({\mathbf {C}}^T_{tot} {\mathbf {E}}_{tot} )^{ - 1}({\mathbf {C}}_{tot}^T \hat {\mathbf {x}}_{tot} - {\mathbf {d}}_{tot} )\) from an original least squares solution to the one satisfying \({\mathbf {C}}_{tot}^T {\mathbf {x}}_{tot} = {\mathbf {d}}_{tot}\), takes in this case the form
$$\displaystyle \begin{aligned} {\hat{\mathbf{x}}^{\prime}}_0 & {=} \hat{\mathbf{x}}_0 - \mathbf{E}({\mathbf{C}}_x^T \mathbf{E})^{ - 1}({\mathbf{C}}_x^T \hat{\mathbf{x}}_0 - {\mathbf{d}}_x ) {=} \hat{\mathbf{x}}_0 - \mathbf{Et}_x ,\quad {\mathbf{t}}_x {=} ({\mathbf{C}}_x^T \mathbf{E})^{ - 1}({\mathbf{C}}_x^T \hat{\mathbf{x}}_0 - {\mathbf{d}}_x ), \end{aligned} $$
(189)
$$\displaystyle \begin{aligned} {\hat{\mathbf{v}}^{\prime}} & = \hat{\mathbf{v}} - \mathbf{E}({\mathbf{C}}_v^T \mathbf{E})^{ - 1}({\mathbf{C}}_v^T \hat{\mathbf{v}} - {\mathbf{d}}_v ) = \hat{\mathbf{v}} - \mathbf{Et}_{v} ,\qquad {\mathbf{t}}_{v} = ({\mathbf{C}}_v^T \mathbf{E})^{ - 1}({\mathbf{C}}_v^T \hat{\mathbf{v}} - {\mathbf{d}}_v ), \end{aligned} $$
(190)
$$\displaystyle \begin{aligned} {\hat{\mathbf{z}}^{\prime}} & = \hat{\mathbf{z}} - \mathbf{Jt}_x - {\mathbf{J}}_t {\mathbf{t}}_v . \end{aligned} $$
(191)
Note that we arrive at the same solution regardless of whether the constraints are applied jointly or separately. Seeking \(\hat {\mathbf {{x}^{\prime }}}_0 = \hat {\mathbf {x}}_0 + \mathbf {Ep}_0 \) satisfying \({\mathbf {C}}_x^T {\mathbf {x}}_0 = {\mathbf {d}}_x \) alone leads to the solution (189), and seeking \(\hat {\mathbf {{v}^{\prime }}} = \hat {\mathbf {v}} + \mathbf {E}\dot {\mathbf {p}}\) satisfying \({\mathbf {C}}_v^T \mathbf {v} = {\mathbf {d}}_v \) alone leads to the solution (190).
We look first into the case of separate generalized inner constraints. In the case of minimum weighted distance, if we want to minimize \(\phi _x = \vert \vert {\hat {\mathbf {x}}^{\prime }}_0 - {\mathbf {x}}_0^{ref} \vert \vert _{{\mathbf {W}}_{{\mathbf {x}}_0 } }^2 \) alone, then replacing \(\hat {\mathbf {{x}^{\prime }}}_0 = \hat {\mathbf {x}}_0 + \mathbf {Ep}_0 \) we find that the minimum of
$$\displaystyle \begin{aligned} \phi_x = ({\hat{\mathbf{x}}^{\prime}}_0 - {\mathbf{x}}_0^{ref} )^T{\mathbf{W}}_{{\mathbf{x}}_0 } ({\hat{\mathbf{x}}^{\prime}}_0 - {\mathbf{x}}_0^{ref} ) = (\hat{\mathbf{x}}_0 + \mathbf{Ep}_0 - {\mathbf{x}}_0^{ref} )^T{\mathbf{W}}_{{\mathbf{x}}_0 } (\hat{\mathbf{x}}_0 + \mathbf{Ep}_0 - {\mathbf{x}}_0^{ref} ) = \mathop{\min}\limits_{{\mathbf{p}}_0 } , \end{aligned} $$
(192)
is provided by \(\frac {\partial \phi _x }{\partial {\mathbf {p}}_0 } = 2(\hat {\mathbf {x}}_0 + \mathbf {Ep}_0 - {\mathbf {x}}_0^{ref} )^T{\mathbf {W}}_{{\mathbf {x}}_0 } \mathbf {E} = \mathbf {0}\), which gives \({\mathbf {p}}_0 = - ({\mathbf {E}}^T{\mathbf {W}}_{{\mathbf {x}}_0 } \mathbf {E})^{ - 1}{\mathbf {E}}^T{\mathbf {W}}_{{\mathbf {x}}_0} (\hat {\mathbf {x}}_0 - {\mathbf {x}}_0^{ref} )\) and hence
$$\displaystyle \begin{aligned} {\hat{\mathbf{x}}^{\prime}}_0 = \hat{\mathbf{x}}_0 - \mathbf{E}({\mathbf{E}}^T{\mathbf{W}}_{{\mathbf{x}}_0 } \mathbf{E})^{ - 1}{\mathbf{E}}^T{\mathbf{W}}_{{\mathbf{x}}_0 } (\hat{\mathbf{x}}_0 - {\mathbf{x}}_0^{ref} ). \end{aligned} $$
(193)
If we want to minimize \(\phi _v = \vert \vert {\hat {\mathbf {v}}^{\prime }} - {\mathbf {v}}^{ref}\vert \vert _{{\mathbf {W}}_{\mathbf {v}} }^2 \) alone, then replacing \({\hat {\mathbf {v}}^{\prime }} = \hat {\mathbf {v}} + \mathbf {E}\dot {\mathbf {p}}\) we find that the minimum of
$$\displaystyle \begin{aligned} \phi_v = ({\hat{\mathbf{v}}^{\prime}} - {\mathbf{v}}^{ref})^T{\mathbf{W}}_{\mathbf{v}} ({\hat{\mathbf{v}}^{\prime}} - {\mathbf{v}}^{ref}) = (\hat{\mathbf{v}} + \mathbf{E}\dot{\mathbf{p}} - {\mathbf{v}}^{ref})^T{\mathbf{W}}_{\mathbf{v}} (\hat{\mathbf{v}} + \mathbf{E}\dot{\mathbf{p}} - {\mathbf{v}}^{ref}) = \mathop{\min }\limits_{\dot{\mathbf{p}}} , \end{aligned} $$
(194)
is provided by \(\frac {\partial \phi _v }{\partial \dot {\mathbf {p}}} = 2 (\hat {\mathbf {v}} + \mathbf {E}\dot {\mathbf {p}} - {\mathbf {v}}^{ref})^T{\mathbf {W}}_{\mathbf {v}} \mathbf {E} = \mathbf {0}\), which gives \(\dot {\mathbf {p}} = - ({\mathbf {E}}^T{\mathbf {W}}_{\mathbf {v}} \mathbf {E})^{ - 1}{\mathbf {E}}^T{\mathbf {W}}_{\mathbf {v}} (\hat {\mathbf {v}} - {\mathbf {v}}^{ref} )\) and hence
$$\displaystyle \begin{aligned} {\hat{\mathbf{v}}^{\prime}} = \hat{\mathbf{v}} - \mathbf{E}({\mathbf{E}}^T{\mathbf{W}}_{\mathbf{v}} \mathbf{E})^{ - 1}{\mathbf{E}}^T{\mathbf{W}}_{\mathbf{v}} (\hat{\mathbf{v}} - {\mathbf{v}}^{ref}). \end{aligned} $$
(195)
Comparing with the joint generalized inner constraints solution, where \(\phi _x + \phi _v = \vert \vert {\hat {\mathbf {x}}^{\prime }}_0 - {\mathbf {x}}_0^{ref} \vert \vert _{{\mathbf {W}}_{{\mathbf {x}}_0 } }^2 + \vert \vert {\hat {\mathbf {v}}^{\prime }} - {\mathbf {v}}^{ref}\vert \vert _{{\mathbf {W}}_{\mathbf {v}} }^2 \) is minimized, we see that the solutions are the same. Therefore, when \(\vert \vert {\hat {\mathbf {x}}^{\prime }}_0 - {\mathbf {x}}_0^{ref} \vert \vert _{{\mathbf {W}}_{{\mathbf {x}}_0 } }^2 + \vert \vert {\hat {\mathbf {v}}^{\prime }} - {\mathbf {v}}^{ref}\vert \vert _{{\mathbf {W}}_{\mathbf {v}} }^2 \) is minimized, \(\vert \vert {\hat {\mathbf {x}}^{\prime }}_0 - {\mathbf {x}}_0^{ref} \vert \vert _{{\mathbf {W}}_{{\mathbf {x}}_0 } }^2 \) and \(\vert \vert {\hat {\mathbf {v}}^{\prime }} - {\mathbf {v}}^{ref}\vert \vert _{{\mathbf {W}}_{\mathbf {v}} }^2 \) are also simultaneously minimized! In the special case of the classical inner constraints (\({\mathbf {x}}_0^{ref} = \mathbf {0}\), \({\mathbf {v}}^{ref} = \mathbf {0}\), \({\mathbf {W}}_{{\mathbf {x}}_0 } = {\mathbf {W}}_{\mathbf {v}} = \mathbf {I}\)), the solution where \(\vert \vert {\hat {\mathbf {x}}^{\prime }}_0 \vert \vert ^2 + \vert \vert {\hat {\mathbf {v}}^{\prime }}\vert \vert ^2\) is minimized also minimizes \(\vert \vert {\hat {\mathbf {x}}^{\prime }}_0 \vert \vert ^2\) and \(\vert \vert {\hat {\mathbf {v}}^{\prime }}\vert \vert ^2\) separately.

The importance of the above conclusions lies in the fact that we do not need to use a minimum norm or a minimum weighted distances principle for both initial coordinates and velocities. We can combine such a principle for one of the two, e.g., \(\vert \vert \hat {\mathbf {{x}^{\prime }}}_0 - {\mathbf {x}}_0^{ref} \vert \vert _{{\mathbf {W}}_{{\mathbf {x}}_0 } }^2 = \min \) for x0 via the constraints \({\mathbf {E}}^T{\mathbf {W}}_{{\mathbf {x}}_0 } ({\mathbf {x}}_0 - {\mathbf {x}}_0^{ref} ) = \mathbf {0}\), with any other set of constraints whatsoever for the other, e.g., \({\mathbf {C}}_v^T \mathbf {v} = {\mathbf {d}}_v\). This will prove to be an important property, because \(\vert \vert \hat {\mathbf {{x}^{\prime }}}_0 - {\mathbf {x}}_0^{ref} \vert \vert _{{\mathbf {W}}_{{\mathbf {x}}_0 } }^2 = \min \) is quite useful in bringing the solution \({\hat {\mathbf {x}}^{\prime }}_0 \) close to a preexisting one \({\mathbf {x}}_0^{ref}\), while for velocities a different optimal choice is preferable, as we will see in the next section.

By the way, it is important to keep in mind that initial coordinate constraints \({\mathbf {C}}_x^T {\mathbf {x}}_0 = {\mathbf {d}}_x \) should always be combined with a set of velocity constraints \({\mathbf {C}}_v^T \mathbf {v} = {\mathbf {d}}_v\), and vice versa, because neither of the two can stand alone as a set of partial minimal constraints. We know that the simultaneous use of the partial inner constraints \({\mathbf {E}}^T{\mathbf {x}}_0 = {\mathbf {d}}_x\) and \({\mathbf {E}}^T\mathbf {v} = {\mathbf {d}}_v\) yields a solution where, in addition to the minimum norm property, the trace of the joint covariance cofactor matrix of \(\hat {\mathbf {x}}_0\) and \(\hat {\mathbf {v}}\) is minimized, i.e., \(tr{\mathbf {Q}}_{\hat {\mathbf {x}}_0 } + tr{\mathbf {Q}}_{\hat {\mathbf {v}}} = \min \). An interesting question is whether this choice also minimizes \(tr{\mathbf {Q}}_{\hat {\mathbf {x}}_0 } \) and \(tr{\mathbf {Q}}_{\hat {\mathbf {v}}} \) separately, as it does in the case of the norms. The answer is positive. To demonstrate this, consider the constraints \({\mathbf {C}}_x^T {\mathbf {x}}_0 = {\mathbf {d}}_x \), which provide the solution \({\hat {\mathbf {x}}^{\prime }}_0 = [\mathbf {I} - \mathbf {E}({\mathbf {C}}_x^T \mathbf {E})^{ - 1}{\mathbf {C}}_x^T ]\,\hat {\mathbf {x}}_0 + \mathbf {E}({\mathbf {C}}_x^T \mathbf {E})^{ - 1}{\mathbf {d}}_x \), with covariance cofactor matrix
$$\displaystyle \begin{aligned} {\mathbf{Q}}_{{\hat{\mathbf{x}}^{\prime}}_0 } = [\mathbf{I} - \mathbf{E}({\mathbf{C}}_x^T \mathbf{E})^{ - 1}{\mathbf{C}}_x^T ]{\mathbf{Q}}_{\hat{\mathbf{x}}_0 } [\mathbf{I} - \mathbf{E}({\mathbf{C}}_x^T \mathbf{E})^{ - 1}{\mathbf{C}}_x^T ]^T = \mathbf{HQ}_0 {\mathbf{H}}^T. \end{aligned} $$
(196)
Setting \({\mathbf {Q}}_{{\hat {\mathbf {x}}}_0 } = {\mathbf {Q}}_0\) and \({\mathbf {C}}_x = \mathbf {C}\) for simplicity, we have \(\mathbf {H} = \mathbf {I} - \mathbf {E}({\mathbf {C}}^T\mathbf {E})^{ - 1}{\mathbf {C}}^T\) and we seek the matrix \(\mathbf {C}\) which minimizes
$$\displaystyle \begin{aligned} \phi = tr{\mathbf{Q}}_{{\hat{\mathbf{x}}^{\prime}}_0 } = tr(\mathbf{HQ}_0 {\mathbf{H}}^T). \end{aligned} $$
(197)
Recalling that \(\frac {\partial \mathbf {C}}{\partial C_{ik}} = {\mathbf {e}}_i {\mathbf {e}}_k^T\), \(\frac {\partial {\mathbf {C}}^T}{\partial C_{ik}} = {\mathbf {e}}_k {\mathbf {e}}_i^T\), where \({\mathbf {e}}_i\) stands for the \(i\)th column of the \(3n \times 3n\) identity matrix and \({\mathbf {e}}_k\) for the \(k\)th column of the \(7 \times 7\) identity matrix, and the property \(\frac {\partial ({\mathbf {M}}^{ - 1})}{\partial q} = - {\mathbf {M}}^{ - 1}\frac {\partial \mathbf {M}}{\partial q}{\mathbf {M}}^{ - 1}\), we can easily compute \(\frac {\partial \mathbf {H}}{\partial C_{ik} } = - \mathbf {E}({\mathbf {C}}^T\mathbf {E})^{ - 1}{\mathbf {e}}_k {\mathbf {e}}_i^T \mathbf {H}\), so that
$$\displaystyle \begin{aligned} \frac{\partial \phi }{\partial C_{ik}} & = \mathrm{tr}\left\{ {\frac{\partial \mathbf{H}}{\partial C_{ik} }{\mathbf{Q}}_{0} {\mathbf{H}}^T} \right\} + \mathrm{tr}\left\{ {\mathbf{HQ}_{0} \frac{\partial {\mathbf{H}}^T}{\partial C_{ik} }} \right\} = 2\,\mathrm{tr}\left\{ {\frac{\partial \mathbf{H}}{\partial C_{ik} }{\mathbf{Q}}_{0} {\mathbf{H}}^T} \right\} = \\ & = 2\,\mathrm{tr}\left\{ { - \mathbf{E}({\mathbf{C}}^T\mathbf{E})^{-1}{\mathbf{e}}_k}{\mathbf{e}}_i^{T}\mathbf{HQ}_{0}{\mathbf{H}}^{T} \right\} = - 2{\mathbf{e}}_i^T \mathbf{HQ}_0 {\mathbf{H}}^T\mathbf{E}({\mathbf{C}}^T\mathbf{E})^{- 1}{\mathbf{e}}_k=\\ &= - 2[\mathbf{HQ}_0 {\mathbf{H}}^T\mathbf{E}({\mathbf{C}}^T\mathbf{E})^{-1}]_{ik}. \end{aligned} $$
(198)
The minimum is obtained for \(\frac {\partial \phi }{\partial C_{ik} } = 0\) and therefore from the solution of the matrix equation
$$\displaystyle \begin{aligned} \mathbf{F}(\mathbf{C}) = [\mathbf{H}(\mathbf{C})] {\mathbf{Q}}_0 [\mathbf{H}(\mathbf{C})]^T \mathbf{E} = \mathbf{0}. \end{aligned} $$
(199)

Since obviously \([\mathbf {H}(\mathbf {E})]^T\mathbf {E} = \mathbf {0}\), it follows that the choice \({\mathbf {C}}_x = \mathbf {C} = \mathbf {E}\) satisfies the above equation, and thus the constraints \({\mathbf {E}}^T{\mathbf {x}}_0 = {\mathbf {d}}_x\) minimize \(tr{\mathbf {Q}}_{{\hat {\mathbf {x}}^{\prime }}_0}\), independently of the constraints \({\mathbf {C}}_v^T \mathbf {v} = {\mathbf {d}}_v\) used for the velocities. In a completely analogous way the constraints \({\mathbf {E}}^T\mathbf {v} = {\mathbf {d}}_v\) minimize \(tr{\mathbf {Q}}_{\hat {\mathbf {v}}^{\prime }}\), independently of the constraints \({\mathbf {C}}_x^T {\mathbf {x}}_0 = {\mathbf {d}}_x\) used for the initial coordinates. In particular, the joint use of \({\mathbf {E}}^T{\mathbf {x}}_0 = {\mathbf {d}}_x\) and \({\mathbf {E}}^T\mathbf {v} = {\mathbf {d}}_v\) does not only minimize \(tr{\mathbf {Q}}_{{\hat {\mathbf {x}}^{\prime }}_0 } + tr{\mathbf {Q}}_{\hat {\mathbf {v}}^{\prime }}\), as already known, but also \(tr{\mathbf {Q}}_{{\hat {\mathbf {x}}^{\prime }}_0 }\) and \(tr{\mathbf {Q}}_{\hat {\mathbf {v}}^{\prime }}\) separately!
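This optimality is easy to probe numerically: for any symmetric positive semi-definite \({\mathbf{Q}}_0\), the trace of \(\mathbf{H}{\mathbf{Q}}_0{\mathbf{H}}^T\) attained with \(\mathbf{C} = \mathbf{E}\) is never exceeded by a random choice of \(\mathbf{C}\). A small experiment (Python/NumPy; purely illustrative toy dimensions):

```python
import numpy as np

rng = np.random.default_rng(3)
N, r = 12, 4                               # toy sizes for the parameter and defect spaces
E = rng.normal(size=(N, r))
A = rng.normal(size=(N, N))
Q0 = A @ A.T                               # a generic covariance cofactor matrix

def trace_after(C):
    """tr(H Q0 H^T) with H = I - E (C^T E)^{-1} C^T, cf. Eq. (197)."""
    H = np.eye(N) - E @ np.linalg.solve(C.T @ E, C.T)
    return np.trace(H @ Q0 @ H.T)

best_random = min(trace_after(rng.normal(size=(N, r))) for _ in range(200))
print(trace_after(E) <= best_random)       # True: C = E minimizes the trace
```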

In some applications one may seek to transform velocities \({\hat {\mathbf {v}}^{\prime }} = \hat {\mathbf {v}} + \mathbf {E}\dot {\mathbf {p}}\) in such a way that \({\hat {\mathbf {v}}}^{\prime T}{\mathbf {W}}_v {\hat {\mathbf {v}}^{\prime }}\) is minimized. The solution is obviously the special case of the generalized inner constraints solution with \({\mathbf {v}}^{ref} = \mathbf {0}\), given by \({\hat {\mathbf {v}}^{\prime }} =\hat {\mathbf {v}} - \mathbf {E}({\mathbf {E}}^T{\mathbf {W}}_v \mathbf {E})^{ - 1}{\mathbf {E}}^T{\mathbf {W}}_v \hat {\mathbf {v}}\). This has covariance cofactor matrix
$$\displaystyle \begin{aligned} {\mathbf{Q}}_{{\hat{\mathbf{v}}^{\prime}}} = [\mathbf{I} - \mathbf{E}({\mathbf{E}}^T{\mathbf{W}}_v \mathbf{E})^{ - 1}{\mathbf{E}}^T{\mathbf{W}}_v ]{\mathbf{Q}}_{\hat{\mathbf{v}}} [\mathbf{I} - \mathbf{E}({\mathbf{E}}^T{\mathbf{W}}_v \mathbf{E})^{ - 1}{\mathbf{E}}^T{\mathbf{W}}_v ]^T = \mathbf{GQ}_{\hat{\mathbf{v}}} {\mathbf{G}}^T. \end{aligned} $$
(200)
If one seeks the weight matrix \({\mathbf{W}}_v\) which minimizes \(tr{\mathbf {Q}}_{{\hat {\mathbf {v}}^{\prime }}}\), then following a procedure similar to that above one arrives at the answer \({\mathbf {W}}_v = {\mathbf {Q}}_{\hat {\mathbf {v}}}^{ - 1}\) and the solution
$$\displaystyle \begin{aligned} {\hat{\mathbf{v}}^{\prime}} = \hat{\mathbf{v}} - \mathbf{E}({\mathbf{E}}^T{\mathbf{Q}}_{\hat{\mathbf{v}}}^{ - 1} \mathbf{E})^{ - 1}{\mathbf{E}}^T{\mathbf{Q}}_{\hat{\mathbf{v}}}^{ - 1} \hat{\mathbf{v}}. \end{aligned} $$
(201)
This solution, however, does not minimize \(tr{\mathbf {Q}}_{{\hat {\mathbf {v}}^{\prime }}}\) among all possible least squares solutions, but only within the smaller class of least squares solutions which minimize \({\hat {\mathbf {v}}}^{\prime T}{\mathbf {W}}_v {\hat {\mathbf {v}}^{\prime }}\) as \({\mathbf{W}}_v\) runs over all possible symmetric positive-definite weight matrices. Thus if \(tr{\mathbf {Q}}_{{\hat {\mathbf {v}}^{\prime }}} = \min \) is desired, one should instead directly use the constraints \({\mathbf{E}}^T\mathbf{v} = {\mathbf{d}}_v\), with arbitrary \({\mathbf{d}}_v\), or simply \({\mathbf{E}}^T\mathbf{v} = \mathbf{0}\). For a solution satisfying \({\hat {\mathbf {v}}}^{\prime T}{\mathbf {W}}_v {\hat {\mathbf {v}}^{\prime }} = \min \) one should seek different criteria for the choice of the weight matrix. Apart from the obvious choice \({\mathbf{W}}_v = \mathbf{I}\), a choice \({\mathbf{W}}_v \ne \mathbf{I}\) may be based on the need to assign larger weights to the coordinates of stations with high-quality results and to down-weight stations of lower quality. Another option is to assign larger weights to more stable stations that reflect overall plate or sub-plate behavior, and smaller ones to stations affected by localized tectonic behavior.

10 Kinematic Minimal Constraints

All the above forms of minimal constraints are useful for adapting the solution of stacking either to a previous solution, or to some geophysical plate motion model in order to check its consistency with the observed data. However, they do not produce an optimal reference system, in the sense of one where the temporal variation of the coordinates reflects only temporal variations in the network shape, and not variations due solely to the choice of the reference system at each epoch. We have already established in Sect. 6 that an optimal reference system corresponds to a shortest geodesic in the subspace \(\bigcup _t S_t \) of the space \({\mathbb {R}}^{3n}\) of all the coordinates of the \(n\) station points. We have also seen that such a choice is in fact a discrete version of Tisserand's choice of reference system, based on the minimization of the relative kinetic energy of the earth masses. In the most general case of arbitrary station variation \({\mathbf {x}}_i (t)\) the defining conditions are the barycenter preservation condition
$$\displaystyle \begin{aligned} {\mathbf{x}}_B (t) = \frac{1}{n}\sum_i {{\mathbf{x}}_i (t) = {\mathbf{c}}_B},\qquad \forall t, \end{aligned} $$
(202)
(cB = constant) for the definition of the reference system origin and the vanishing relative angular momentum condition
$$\displaystyle \begin{aligned} \mathbf{h}(t) = \sum_i {[\{{\mathbf{x}}_i (t) - {\mathbf{x}}_B } (t)\}\times ]\frac{d\{{\mathbf{x}}_i (t) - {\mathbf{x}}_B (t)\}}{dt} = \mathbf{0},\quad \forall t, \end{aligned} $$
(203)
for the definition of the reference system orientation. Furthermore, we need a condition for the definition of the reference system scale. We choose to preserve the mean quadratic size \(S(t)\) of the network, quantified by the mean quadratic value of the distances of all stations from their barycenter
$$\displaystyle \begin{aligned} S^2(t) \equiv \frac{1}{n}\sum_i {[{\mathbf{x}}_i (t) - {\mathbf{x}}_B } (t)]^T[{\mathbf{x}}_i (t) - {\mathbf{x}}_B (t)] = \frac{1}{n}C_S^2 ,\quad \forall t, \end{aligned} $$
(204)
(\(C_{S}^{2} = \mathrm {constant}\)). Different values of the constants \({\mathbf{c}}_B\) and \(C_S^2\) lead to different minimum length geodesics, which are, however, parallel in the sense that they are connected by time-invariant transformation parameters. If a choice \({\mathbf {x}}_i (t)\) satisfies the condition \(\mathbf{h}(t) = \mathbf{0}\), \(\forall t\), then it can be easily established that a parallel reference system choice \(\tilde {\mathbf {x}}_i (t) = (1 + s)\mathbf {Rx}_i (t) + \mathbf {d}\) (\(s\), \(\mathbf{R}\), \(\mathbf{d}\) constant) has relative angular momentum \(\tilde {\mathbf {h}} = (1 + s)^2\mathbf {Rh}\), which also vanishes. Thus, the orientation condition holds for all parallel reference systems, and a specific reference system can be obtained by a specific choice of the orientation at some initial epoch \(t_0\), along with the choice of the constants \({\mathbf{c}}_B\) and \(C_S^2\).
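The invariance of the vanishing relative angular momentum under parallel systems is easy to verify numerically, since \((\mathbf{Ra})\times (\mathbf{Rb}) = \mathbf{R}(\mathbf{a}\times \mathbf{b})\) for any rotation matrix. A small check (Python/NumPy; all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=(6, 3))                 # toy positions at some epoch
v = rng.normal(size=(6, 3))                 # and their velocities

def rel_angular_momentum(x, v):
    """h = sum_i [(x_i - x_B) x] d(x_i - x_B)/dt, cf. Eq. (203)."""
    dx = x - x.mean(axis=0)
    dv = v - v.mean(axis=0)
    return np.cross(dx, dv).sum(axis=0)

a = 0.3                                     # rotation angle about the z-axis
R = np.array([[np.cos(a), -np.sin(a), 0.0],
              [np.sin(a),  np.cos(a), 0.0],
              [0.0, 0.0, 1.0]])
s, d = 1e-3, np.array([1.0, -2.0, 0.5])     # constant scale and shift
xt = (1 + s) * x @ R.T + d                  # parallel reference system
vt = (1 + s) * v @ R.T                      # velocities carry no translation term

h = rel_angular_momentum(x, v)
print(np.allclose(rel_angular_momentum(xt, vt), (1 + s) ** 2 * R @ h))  # True
```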
When network station motion follows a linear-in-time model xi(t) = x0i + (t − t0)vi, the above three conditions become
$$\displaystyle \begin{aligned} &{\mathbf{x}}_B (t) = \bar{\mathbf{x}}_0 + (t - t_0 )\bar{\mathbf{v}} = {\mathbf{c}}_B,\qquad \forall t, \end{aligned} $$
(205)
$$\displaystyle \begin{aligned} &\mathbf{h} = \sum_i {[({\mathbf{x}}_{0i} - \bar{\mathbf{x}}_0 ) \times ]({\mathbf{v}}_i - \bar{\mathbf{v}})} = \mathbf{0}, \end{aligned} $$
(206)
$$\displaystyle \begin{aligned} &S^2(t) = \frac{1}{n}\sum_i {({\mathbf{x}}_{0i} - \bar{\mathbf{x}}_0 )^T({\mathbf{x}}_{0i} - \bar{\mathbf{x}}_0 ) + 2(t - t_0 )} \frac{1}{n}\sum_i {({\mathbf{x}}_{0i} - \bar{\mathbf{x}}_0 )^T({\mathbf{v}}_i - \bar{\mathbf{v}})}\equiv\\ &\qquad \ \ {\equiv S_0 + 2(t - t_0 )S_1 = \frac{1}{n}C_S^2 } ,\quad \forall t, \end{aligned} $$
(207)
where
$$\displaystyle \begin{aligned} \bar{\mathbf{x}}_0 & = \frac{1}{n}\sum_i {{\mathbf{x}}_{0i} ,} \qquad \bar{\mathbf{v}} = \frac{1}{n}\sum_i {{\mathbf{v}}_i ,} \end{aligned} $$
(208)
$$\displaystyle \begin{aligned} S_0 & = \frac{1}{n}\sum_i {({\mathbf{x}}_{0i} - \bar{\mathbf{x}}_0 )^T({\mathbf{x}}_{0i} - \bar{\mathbf{x}}_0 )} = \frac{1}{n}\sum_i {{\mathbf{x}}_{0i}^T \,{\mathbf{x}}_{0i} - \bar{\mathbf{x}}_0^T \bar{\mathbf{x}}_0 ,} \end{aligned} $$
(209)
$$\displaystyle \begin{aligned} S_1 & = \frac{1}{n}\sum_i {({\mathbf{x}}_{0i} - \bar{\mathbf{x}}_0 )^T({\mathbf{v}}_i - \bar{\mathbf{v}})} = \frac{1}{n}\sum_i {{\mathbf{x}}_{0i}^T \,{\mathbf{v}}_i - \bar{\mathbf{x}}_0^T \bar{\mathbf{v}}.} \end{aligned} $$
(210)
Under the linear model the relative angular momentum \(\mathbf{h}(t) = \mathbf{h}\) of (206) is time-invariant, so once imposed, the orientation condition already holds for any epoch. For \({\mathbf {x}}_B (t) = {\mathbf {c}}_B\) and \(S^2(t) = C_S^2 / n\) to hold for any epoch it must hold that \(\bar {\mathbf {x}}_0 = {\mathbf {c}}_B\), \(\bar {\mathbf {v}} = \mathbf {0}\), \(S_0 = C_S^2 / n\) and \(S_1 = 0\). Thus we arrive at the nonlinear kinematic constraints
$$\displaystyle \begin{aligned} & \frac{1}{n}\sum_i {{\mathbf{x}}_{0i} = {\mathbf{c}}_B ,} \end{aligned} $$
(211a)
$$\displaystyle \begin{aligned} & \sum_i {{\mathbf{v}}_i = \mathbf{0},} \end{aligned} $$
(211b)
$$\displaystyle \begin{aligned} & \sum_i {[{\mathbf{x}}_{0\,i} \times ]{\mathbf{v}}_i - } \left( {\frac{1}{n}\sum_i {[{\mathbf{x}}_{0\,i} \times ]} } \right)\left( {\sum_i {{\mathbf{v}}_i } } \right) = \mathbf{0}, \end{aligned} $$
(211c)
$$\displaystyle \begin{aligned} & \sum_i {{\mathbf{x}}_{0\,i}^T {\mathbf{x}}_{0i} } - \frac{1}{n}\left( {\sum_i {{\mathbf{x}}_{0\,i} } } \right)^T\left( {\sum_i {{\mathbf{x}}_{0i} } } \right) = C_S^2, \end{aligned} $$
(211d)
$$\displaystyle \begin{aligned} & \frac{1}{n}\sum_i {{\mathbf{x}}_{0\,i}^T {\mathbf{v}}_i - } \left( {\frac{1}{n}\sum_i {{\mathbf{x}}_{0\,i} } } \right)^T\left( {\frac{1}{n}\sum_i {{\mathbf{v}}_i } } \right) = 0, \end{aligned} $$
(211e)
which define, respectively, the initial origin, the origin rate, the orientation rate, the initial scale and the scale rate. While (211a) and (211b) are already linear, the rest are nonlinear with respect to the unknowns \({\mathbf{x}}_{0i}\), \({\mathbf{v}}_i\) and they need to be linearized in order to obtain the desired linear kinematic constraints, which are the minimal constraints to be used. Using approximate values \({\mathbf {x}}_{0i}^{ap} \), \({\mathbf {v}}_i^{ap} \) and ignoring second order terms in the small quantities \(\delta {\mathbf {x}}_{0i} = {\mathbf {x}}_{0i} - {\mathbf {x}}_{0i}^{ap} \), \(\delta {\mathbf {v}}_i = {\mathbf {v}}_{i} - {\mathbf {v}}_i^{ap}\), we obtain the kinematic constraints defining the orientation, origin and scale at the initial epoch
$$\displaystyle \begin{aligned} {\mathbf{C}}_x^{T} \delta {\mathbf{x}}_0 = \sum_i {{\mathbf{C}}_{xi}^{T} \delta {\mathbf{x}}_{0i}} = \sum_i {\left[ \begin{array}{c} {[{\mathbf{x}}_{0\,i}^{ap} \times ]} \\ {{\mathbf{I}}_3 } \\ {({\mathbf{x}}_{0\,i}^{ap} - \bar{\mathbf{x}}_{0}^{ap} )^T} \end{array} \right]} \delta {\mathbf{x}}_{0i} = \left[ \begin{array}{c} \mathbf{0} \\ {n({\mathbf{c}}_B - \bar{\mathbf{x}}_0^{ap} )} \\ {\frac{C_S^2 - \gamma^2}{2}} \end{array} \right] = {\mathbf{d}}_x , \end{aligned} $$
(212)
and the kinematic constraints defining the rates (temporal evolution) of orientation, origin and scale
$$\displaystyle \begin{aligned} {\mathbf{C}}_v^{T} \delta\mathbf{v} = \sum_i {{\mathbf{C}}_{vi}^{T} \delta{\mathbf{v}}_i } = \sum_i {\left[ \begin{array}{c} {[({\mathbf{x}}_{0\,i}^{ap} - \bar{\mathbf{x}}_{0\,}^{ap} )\times ]} \\ {{\mathbf{I}}_3 } \\ {({\mathbf{x}}_{0\,i}^{ap} - \bar{\mathbf{x}}_0^{ap} )^T} \end{array} \right]} \delta {\mathbf{v}}_i = \left[ \begin{array}{c} { - {\mathbf{h}}^{ap}} \\ { - n\bar{\mathbf{v}}^{ap}} \\ { - \kappa } \end{array} \right] = {\mathbf{d}}_v , \end{aligned} $$
(213)
where \(\overline {\mathbf {x}}_{0}^{ap}, \overline {\mathbf {v}}^{ap}\) are the means of the \({\mathbf {x}}_{0i}^{ap}, {\mathbf {v}}_{i}^{ap}\) values, respectively,
$$\displaystyle \begin{aligned} &\gamma^2 = nS_0^{ap} = \sum_i {({\mathbf{x}}_{0\,i}^{ap} - \bar{\mathbf{x}}_0^{ap} )}^T({\mathbf{x}}_{0\,i}^{ap} - \bar{\mathbf{x}}_0^{ap} ) = \sum_i {({\mathbf{x}}_{0\,i}^{ap} - \bar{\mathbf{x}}_0^{ap} )}^T{\mathbf{x}}_{0\,i}^{ap} , \end{aligned} $$
(214)
$$\displaystyle \begin{aligned} & \kappa = nS_1^{ap} = \sum_i {({\mathbf{x}}_{0\,i}^{ap} - \bar{\mathbf{x}}_0^{ap} )}^T({\mathbf{v}}_{\,i}^{ap} - \bar{\mathbf{v}}^{ap}) = \sum_i {({\mathbf{x}}_{0\,i}^{ap} - \bar{\mathbf{x}}_0^{ap} )}^T{\mathbf{v}}_{\,i}^{ap} , \end{aligned} $$
(215)
$$\displaystyle \begin{aligned} & {\mathbf{h}}^{ap} = \sum_i {[({\mathbf{x}}_{0\,i}^{ap} - \bar{\mathbf{x}}_0^{ap} )} \times ]({\mathbf{v}}_{\,i}^{ap} - \bar{\mathbf{v}}^{ap}) = \sum_i {[({\mathbf{x}}_{0\,i}^{ap} - \bar{\mathbf{x}}_0^{ap} )} \times ]{\mathbf{v}}_{\,i}^{ap} . \end{aligned} $$
(216)
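The auxiliary constants (214)–(216) depend only on the approximate values and can be computed once, e.g. as in the following sketch (Python/NumPy; kinematic_invariants is our own name):

```python
import numpy as np

def kinematic_invariants(x0_ap, v_ap):
    """gamma^2, kappa and h^ap of Eqs. (214)-(216);
    x0_ap, v_ap: (n, 3) approximate initial coordinates and velocities."""
    dx = x0_ap - x0_ap.mean(axis=0)
    dv = v_ap - v_ap.mean(axis=0)
    gamma2 = float((dx * dx).sum())          # Eq. (214)
    kappa = float((dx * dv).sum())           # Eq. (215)
    h_ap = np.cross(dx, dv).sum(axis=0)      # Eq. (216)
    return gamma2, kappa, h_ap
```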
Replacing \(\delta {\mathbf {x}}_{0i} = {\mathbf {x}}_{0i} - {\mathbf {x}}_{0i}^{ap}\), \(\delta {\mathbf {v}}_i = {\mathbf {v}}_{0i} - {\mathbf {v}}_i^{ap}\), the linearized kinematic constraints can be converted into ones with respect to the original unknowns, those defining the reference system at the initial epoch
$$\displaystyle \begin{aligned} {\mathbf{C}}_x^T \mathbf{x} = \sum_i {{\mathbf{C}}_{xi}^T {\mathbf{x}}_{0i}} = \sum_i {\left[ \begin{array}{c} {[{\mathbf{x}}_{0\,i}^{ap} \times ]} \\ {{\mathbf{I}}_3 } \\ {({\mathbf{x}}_{0\,i}^{ap} - \bar{\mathbf{x}}_{0}^{ap} )^T} \end{array} \right]} {\mathbf{x}}_{0i} = \left[ \begin{array}{c} \mathbf{0} \\ {n{\mathbf{c}}_B } \\ {\frac{C_S^2 + \gamma^2}{2}} \end{array} \right] = {\mathbf{d}}_x , \end{aligned} $$
(217)
and those defining the rate (temporal evolution) of the reference system
$$\displaystyle \begin{aligned} {\mathbf{C}}_v^T \mathbf{v} = \sum_i {{\mathbf{C}}_{vi}^T {\mathbf{v}}_i } = \sum_i {\left[ \begin{array}{c} {[({\mathbf{x}}_{0\,i}^{ap} - \bar{\mathbf{x}}_{0\,}^{ap} )\times ]} \\ {{\mathbf{I}}_3 } \\ {({\mathbf{x}}_{0\,i}^{ap} - \bar{\mathbf{x}}_0^{ap} )^T} \end{array} \right]} {\mathbf{v}}_i = \left[ \begin{array}{c} \mathbf{0} \\ \mathbf{0} \\ 0 \end{array} \right] = \mathbf{0} = {\mathbf{d}}_v . \end{aligned} $$
(218)
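The constraint matrices of (217) and (218) are assembled station by station; a compact sketch (Python/NumPy; names ours):

```python
import numpy as np

def cross(a):
    """Skew-symmetric matrix [a x]."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def kinematic_constraint_matrices(x0_ap):
    """Stacked 7 x 3n matrices C_x^T and C_v^T of Eqs. (217)-(218)."""
    xbar = x0_ap.mean(axis=0)
    Cx_T = np.hstack([np.vstack([cross(x), np.eye(3),
                                 (x - xbar).reshape(1, 3)]) for x in x0_ap])
    Cv_T = np.hstack([np.vstack([cross(x - xbar), np.eye(3),
                                 (x - xbar).reshape(1, 3)]) for x in x0_ap])
    return Cx_T, Cv_T
```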
The above constraints have been completed with constraints defining the initial epoch orientation, which cannot be obtained from kinematic principles. We have chosen to borrow the missing constraints from the inner constraints, using the equivalent sets \(\sum \nolimits _i {[{\mathbf {x}}_{0\,i}^{ap} } \times ]\delta {\mathbf {x}}_{0i} = \mathbf {0}\) and \(\sum \nolimits _i {[{\mathbf {x}}_{0\,i}^{ap} } \times ]{\mathbf {x}}_{0i} = \mathbf {0}\). The constants cB, \(C_S^2\) can be chosen arbitrarily, thus providing different solutions among all possible parallel solutions, which are all optimal from the kinematic point of view. Any particular optimal solution depends on the choice of the reference system at the initial epoch, realized by the choice of cB for the initial origin and the choice of \(C_S^2\) for the initial scale. Initial orientation depends on the approximate values used in the initial orientation constraint borrowed from the inner constraints. As a possible choice of constants one can choose the mean values over the whole coordinate time series \({\mathbf {x}}_i^{obs} (t_k )\), k = 1, 2, …, m, i = 1, 2, …, n , i.e.,
$$\displaystyle \begin{aligned} {\mathbf{c}}_B & = \frac{1}{m}\sum_{k = 1}^m {{\mathbf{x}}_B^{obs} (t_k ) = \frac{1}{m}} \sum_{k = 1}^m {\left[ {\frac{1}{n}\sum_{i = 1}^n {{\mathbf{x}}_i^{obs} } (t_k )} \right]} , \end{aligned} $$
(219)
$$\displaystyle \begin{aligned} C_S^2 & = \frac{1}{m}\sum_{k = 1}^m {\sum_{i = 1}^n {[{\mathbf{x}}_i^{obs} (t_k ) - {\mathbf{x}}_B^{obs} (t_k )]^T[{\mathbf{x}}_i^{obs} (t_k ) - {\mathbf{x}}_B^{obs} (t_k )]} } . \end{aligned} $$
(220)
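A sketch for the choice (219)–(220) of the constants from the observed coordinate time series (Python/NumPy; kinematic_constants is our own name):

```python
import numpy as np

def kinematic_constants(x_obs):
    """c_B and C_S^2 of Eqs. (219)-(220);
    x_obs: (m, n, 3) observed coordinates, m epochs by n stations."""
    x_bary = x_obs.mean(axis=1)                         # barycenters x_B(t_k)
    c_B = x_bary.mean(axis=0)                           # Eq. (219)
    resid = x_obs - x_bary[:, None, :]
    C_S2 = float((resid ** 2).sum() / x_obs.shape[0])   # Eq. (220)
    return c_B, C_S2
```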
The total constraints have the form
$$\displaystyle \begin{aligned} {\mathbf{C}}_{tot}^T {\mathbf{x}}_{tot} = \left[ \begin{array}{c} {\mathbf{C}}_x^T {\mathbf{x}}_0 \\ {\mathbf{C}}_v^T \mathbf{v} \end{array} \right] = \left[ \begin{array}{ccc} {\mathbf{C}}_x^T & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & {\mathbf{C}}_v^T & \mathbf{0} \end{array} \right]\left[ \begin{array}{c} {\mathbf{x}}_0 \\ \mathbf{v} \\ \mathbf{z} \end{array} \right] = \left[ \begin{array}{c} {\mathbf{d}}_x \\ {\mathbf{d}}_v \end{array} \right] = {\mathbf{d}}_{tot} . \end{aligned} $$
(221)
Instead of applying the kinematic constraints directly, we can apply any other set of minimal constraints and then convert a posteriori the obtained least squares solution into the one with kinematic constraints. Applying the general conversion equation \({\hat {\mathbf {x}}^{\prime }} = \hat {\mathbf {x}} - \mathbf {E}({\mathbf {C}}^T\mathbf {E})^{ - 1}({\mathbf {C}}^T\hat {\mathbf {x}} - \mathbf {d})\), or in two steps \(\mathbf {t} = ({\mathbf {C}}^T\mathbf {E})^{ - 1}({\mathbf {C}}^T\hat {\mathbf {x}} - \mathbf {d})\), \({\hat {\mathbf {x}}^{\prime }} = \hat {\mathbf {x}} - \mathbf {Et}\), we have in our case
$$\displaystyle \begin{aligned} {\hat{\mathbf{x}}^{\prime}}_{tot} = \left[ \begin{array}{c} {\hat{\mathbf{x}}^{\prime}}_0 \\ {\hat{\mathbf{v}}^{\prime}} \\ {\hat{\mathbf{z}}^{\prime}} \end{array} \right] = \hat{\mathbf{x}}_{tot} - {\mathbf{E}}_{tot} {\mathbf{t}}_{tot} = \left[ \begin{array}{c} \hat{\mathbf{x}}_0 - \mathbf{Et}_x \\ \hat{\mathbf{v}} - \mathbf{Et}_v \\ \hat{\mathbf{z}} - \mathbf{Jt}_x - {\mathbf{J}}_t {\mathbf{t}}_v \end{array} \right], \end{aligned} $$
(222)
where
$$\displaystyle \begin{aligned} {\mathbf{t}}_{tot} = \left[ \begin{array}{c} {\mathbf{t}}_x \\ {\mathbf{t}}_v \end{array} \right] = ({\mathbf{C}}_{tot}^T {\mathbf{E}}_{tot} )^{ - 1}({\mathbf{C}}_{tot}^T \hat{\mathbf{x}}_{tot} - {\mathbf{d}}_{tot} ) = \left[ \begin{array}{c} ({\mathbf{C}}_x^T \mathbf{E})^{ - 1}({\mathbf{C}}_x^T \hat{\mathbf{x}}_0 - {\mathbf{d}}_x ) \\ ({\mathbf{C}}_v^T \mathbf{E})^{ - 1}({\mathbf{C}}_v^T \hat{\mathbf{v}} - {\mathbf{d}}_v ) \end{array} \right]. \end{aligned} $$
(223)
With \({\mathbf {t}}_x = [({\mathbf {t}}_x^\theta )^T({\mathbf {t}}_x^d )^Tt_x^s ]^T\) and \({\mathbf {t}}_v = [({\mathbf {t}}_v^\theta )^T({\mathbf {t}}_v^d)^Tt_v^s]^T\) the above transformation takes the explicit form
$$\displaystyle \begin{aligned} &{\hat{\mathbf{x}}^{\prime}}_{0i} = \hat{\mathbf{x}}_{0i} - \left[ {{\mathbf{x}}_{0i}^{ap} \times } \right]{\mathbf{t}}_x^\theta - {\mathbf{t}}_x^d - t_x^s \,{\mathbf{x}}_{0i}^{ap} , \\ &{\hat{\mathbf{v}}^{\prime}}_i = \hat{\mathbf{v}}_i - \left[ {{\mathbf{x}}_{0i}^{ap} \times } \right]{\mathbf{t}}_v^\theta - {\mathbf{t}}_v^d - t_v^s \,{\mathbf{x}}_{0i}^{ap} , \\ &{\hat{\boldsymbol{\theta }}^{\prime}}_k = \hat{\boldsymbol{\theta }}_k - {\mathbf{t}}_x^\theta - (t_k - t_0 ){\mathbf{t}}_v^\theta , \\ &{\hat{\mathbf{d}}^{\prime}}_k = \hat{\mathbf{d}}_k - {\mathbf{t}}_x^d - (t_k - t_0 ){\mathbf{t}}_v^d , \\ &{\hat{s}}^{\prime}_k = \hat{s}_k - t_x^s - (t_k - t_0)t_v^s .{} \end{aligned} $$
(224)
The matrices
$$\displaystyle \begin{aligned} {\mathbf{C}}_x^T \mathbf{E} & = \sum_i \left[ \begin{array}{c} [{\mathbf{x}}_{0\,i}^{ap} \times ] \\ {\mathbf{I}}_3 \\ ({\mathbf{x}}_{0\,i}^{ap} - \bar{\mathbf{x}}_{0\,i}^{ap} )^T \end{array} \right] \left[ {[{\mathbf{x}}_{0\,i}^{ap} \times ]\ {\mathbf{I}}_3 \ {\mathbf{x}}_{0\,i}^{ap} } \right]=\\ & = \left[ \begin{array}{ccc} - \mathbf{C} + n[\bar{\mathbf{x}}_{0\,}^{ap} \times ]^2 & n[\bar{\mathbf{x}}_{0\,}^{ap} \times ] & \mathbf{0} \\ n[\bar{\mathbf{x}}_{0\,}^{ap} \times ] & n{\mathbf{I}}_3 & n\bar{\mathbf{x}}_{0\,}^{ap} \\ \mathbf{0} & \mathbf{0} & {\gamma^2} \end{array} \right], \end{aligned} $$
(225)
$$\displaystyle \begin{aligned} {\mathbf{C}}_v^T \mathbf{E} & = \sum_i \left[ \begin{array}{c} [({\mathbf{x}}_{0\,i}^{ap} - \bar{\mathbf{x}}_{0\,}^{ap} )\times ] \\ {\mathbf{I}}_3\\ ({\mathbf{x}}_{0\,i}^{ap} - \bar{\mathbf{x}}_{0\,}^{ap} )^T \end{array} \right] \left[ {[{\mathbf{x}}_{0\,i}^{ap} \times ]\ {\mathbf{I}}_3 \ {\mathbf{x}}_{0\,i}^{ap} } \right] = \left[ \begin{array}{ccc} - \mathbf{C} & \mathbf{0} & \mathbf{0} \\ n[\bar{\mathbf{x}}_{0}^{ap} \times ] & n{\mathbf{I}}_3 & n\bar{\mathbf{x}}_{0}^{ap} \\ \mathbf{0} & \mathbf{0} & {\gamma^2} \end{array} \right], \end{aligned} $$
(226)
have respective analytical inverses
$$\displaystyle \begin{aligned} ({\mathbf{C}}_x^T \mathbf{E})^{ - 1} &= \left[ \begin{array}{ccc} -{\mathbf{C}}^{-1} & {\mathbf{C}}^{-1}\left[ {\bar{\mathbf{x}}_0^{ap} \times } \right] & \mathbf{0} \\ \left[ {\bar{\mathbf{x}}_0^{ap} \times } \right]{\mathbf{C}}^{ - 1} & \frac{1}{n}{\mathbf{I}}_3 - [\bar{\mathbf{x}}_0^{ap} \times ]{\mathbf{C}}^{ - 1}[\bar{\mathbf{x}}_0^{ap} \times ] & - \frac{1}{\gamma^2}\bar{\mathbf{x}}_0^{ap} \\ \mathbf{0} & \mathbf{0} & \frac{1}{\gamma^2} \end{array} \right], \end{aligned} $$
(227)
$$\displaystyle \begin{aligned} ({\mathbf{C}}_v^T \mathbf{E})^{ - 1} & = \left[ \begin{array}{ccc} - {\mathbf{C}}^{-1} & \mathbf{0} & \mathbf{0} \\ \left[ {\bar{\mathbf{x}}_0^{ap} \times } \right]{\mathbf{C}}^{ - 1} & \frac{1}{n}{\mathbf{I}}_3 & - \frac{1}{\gamma^2}\bar{\mathbf{x}}_0^{ap} \\ \mathbf{0} & \mathbf{0} & \frac{1}{\gamma^2} \end{array} \right], \end{aligned} $$
(228)
which can be used to compute \({\mathbf {t}}_x = [({\mathbf {t}}_x^\theta )^T({\mathbf {t}}_x^d )^Tt_x^s ]^T\) and \({\mathbf {t}}_v = [({\mathbf {t}}_v^\theta )^T({\mathbf {t}}_v^d )^Tt_v^s ]^T\) according to (223). Replacing the obtained values \({\mathbf {t}}_x^\theta \), \({\mathbf {t}}_x^d \), \(t_x^s\) and \({\mathbf {t}}_v^\theta \), \({\mathbf {t}}_v^d \), \(t_v^s\) in the transformation equations (224) we obtain the explicit formulas
$$\displaystyle \begin{aligned} {\hat{\mathbf{x}}^{\prime}}_{0i} & = \hat{\mathbf{x}}_{0i} + [\Delta {\mathbf{x}}_{0i}^{ap} \times ]{\mathbf{C}}^{ - 1}\sum_j {[\Delta {\mathbf{x}}_{0j}^{ap} \times ]} \,\hat{\mathbf{x}}_{0j} - \frac{1}{n}\sum_j {\hat{\mathbf{x}}_{0j}} + {\mathbf{c}}_B+\\ &\quad {+ \left[ {\frac{1}{2} - \frac{1}{\gamma^2}\sum_j {(\Delta {\mathbf{x}}_{0j}^{ap} )^T} \hat{\mathbf{x}}_{0j} + \frac{C_S^2 }{\gamma^2}} \right]} \,\Delta{\mathbf{x}}_{0i}^{ap} , \\ {\hat{\mathbf{v}}^{\prime}}_i & = \hat{\mathbf{v}}_i + [\Delta {\mathbf{x}}_{0i}^{ap} \times ]{\mathbf{C}}^{ - 1}\sum_j {[\Delta {\mathbf{x}}_{0j}^{ap} \times ]} \,\hat{\mathbf{v}}_j - \frac{1}{n}\sum_j {\hat{\mathbf{v}}_j}-\\ &\quad {- \frac{1}{\gamma^2}\left[ {\sum_j {(\Delta {\mathbf{x}}_{0j}^{ap} )^T} \hat{\mathbf{v}}_j } \right]} \,\Delta{\mathbf{x}}_{0i}^{ap}, \\ {\hat{\boldsymbol{\theta }}^{\prime}}_k & = \hat{\boldsymbol{\theta }}_k + {\mathbf{C}}^{ - 1}\sum_j {[\Delta {\mathbf{x}}_{0j}^{ap}\times ][\hat{\mathbf{x}}_{0j} + (t_k - t_0 )\hat{\mathbf{v}}_j ] + n{\mathbf{C}}^{ - 1}} [\bar{\mathbf{x}}_0^{ap} \times ]{\mathbf{c}}_B , \\ {\hat{s}}^{\prime}_k & = \hat{s}_k + \frac{1}{2} - \frac{1}{\gamma^2}\sum_j {(\Delta {\mathbf{x}}_{0j}^{ap} )^T[\hat{\mathbf{x}}_{0j} + (t_k - t_0 )\hat{\mathbf{v}}_j ]} + \frac{C_S^2 }{2\gamma^2}, \\ {\hat{\mathbf{d}}^{\prime}}_k & = \hat{\mathbf{d}}_k + [\bar{\mathbf{x}}_0^{ap} \times ](\hat{\boldsymbol{\theta }}_k - {\hat{\boldsymbol{\theta }}^{\prime}}_k ) - \frac{1}{n}\sum_j {[\hat{\mathbf{x}}_{0j} + (t_k - t_0 )\hat{\mathbf{v}}_j ]} + {\mathbf{c}}_B + (\hat{s}_k - {\hat{s}}^{\prime}_k ) \bar{\mathbf{x}}_0^{ap} ,{} \end{aligned} $$
(229)
where \(\Delta {\mathbf {x}}_{0i}^{ap} = {\mathbf {x}}_{0i}^{ap} - \bar {\mathbf {x}}_0^{ap} \). Similar transformation equations can be derived for the case where the unknowns are the corrections to the original unknown parameters.
Replacing \(\hat {\mathbf {x}}_{0i} = {\mathbf {x}}_{0i}^{ap} + \delta \hat {\mathbf {x}}_{0i} \), \(\hat {\mathbf {v}}_i = {\mathbf {v}}_i^{ap} + \delta \hat {\mathbf {v}}_i \), \({\hat {\mathbf {x}}^{\prime }}_{0i} = {\mathbf {x}}_{0i}^{ap} + \delta {\hat {\mathbf {x}}^{\prime }}_{0i} \), \({\hat {\mathbf {v}}^{\prime }}_i = {\mathbf {v}}_i^{ap} + \delta {\hat {\mathbf {v}}^{\prime }}_i \), we obtain the conversion formulas
$$\displaystyle \begin{aligned} &\delta {\hat{\mathbf{x}}^{\prime}}_{0i} = \delta \hat{\mathbf{x}}_{0i} + [\Delta {\mathbf{x}}_{0i}^{ap} \times ]{\mathbf{C}}^{ - 1}\sum_j {[\Delta {\mathbf{x}}_{0j}^{ap} \times ]\delta \hat{\mathbf{x}}_{0j} } - \frac{1}{n}\sum_j {\delta \hat{\mathbf{x}}_{0j}} + ({\mathbf{c}}_B - \bar{\mathbf{x}}_0^{ap} ) - \\ &\qquad - \left[ {\frac{1}{2} + \frac{1}{\gamma^2}\sum_j {(\Delta {\mathbf{x}}_{0j}^{ap} )^T\delta \hat{\mathbf{x}}_{0j}} + \frac{C_S^2 }{2\gamma^2} } \right]\Delta {\mathbf{x}}_{0i}^{ap} , \\ &\delta {\hat{\mathbf{v}}^{\prime}}_i = \delta \hat{\mathbf{v}}_i + [\Delta {\mathbf{x}}_{0i}^{ap} \times ]{\mathbf{C}}^{ - 1}\sum_j {[\Delta {\mathbf{x}}_{0j}^{ap} \times ]\delta \hat{\mathbf{v}}_j } + [\Delta {\mathbf{x}}_{0i}^{ap} \times ]{\mathbf{C}}^{ - 1}{\mathbf{h}}^{ap}-\\ &\qquad - \frac{1}{n}\sum_j {\delta \hat{\mathbf{v}}_j} - \bar{\mathbf{v}}^{ap} - \frac{1}{\gamma^2}\left[ {\sum_j {(\Delta {\mathbf{x}}_{0j}^{ap} )^T\delta \hat{\mathbf{v}}_j} + \kappa } \right]\Delta {\mathbf{x}}_{0i}^{ap} , \\ &{\hat{\boldsymbol{\theta }}^{\prime}}_k = \hat{\boldsymbol{\theta }}_k + {\mathbf{C}}^{ - 1}\sum_j {[\Delta {\mathbf{x}}_{0j}^{ap} \times ][\delta \hat{\mathbf{x}}_{0j} + (t_k - t_0 )\delta \hat{\mathbf{v}}_j ] + n{\mathbf{C}}^{ - 1}} [\bar{\mathbf{x}}_0^{ap} \times ]{\mathbf{c}}_B + \\ &\qquad + (t_k - t_0 ){\mathbf{C}}^{ - 1}{\mathbf{h}}^{ap}, \\ &{\hat{s}}^{\prime}_k = \hat{s}_k + \frac{1}{2} - \frac{1}{\gamma^2}\sum_j {(\Delta {\mathbf{x}}_{0j}^{ap} )^T[\delta \hat{\mathbf{x}}_{0j} + (t_k - t_0 )\delta\hat{\mathbf{v}}_j ]} + \frac{C_S^2 }{2\gamma^2} - \frac{\kappa }{\gamma^2}(t_k - t_0 ), \\ &{\hat{\mathbf{d}}^{\prime}}_k = \hat{\mathbf{d}}_k + [\bar{\mathbf{x}}_0^{ap} \times ](\hat{\boldsymbol{\theta }}_k - {\hat{\boldsymbol{\theta }}^{\prime}}_k ) - \frac{1}{n}\sum_j {[\delta \hat{\mathbf{x}}_{0j} + (t_k - t_0 )\delta \hat{\mathbf{v}}_j ]} + ({\mathbf{c}}_B - \bar{\mathbf{x}}_0^{ap} )-\\ &\qquad {- (t_k - t_0 )\bar{\mathbf{v}}^{ap} + (\hat{s}_k - {\hat{s}}^{\prime}_k )} \bar{\mathbf{x}}_0^{ap} .{} \end{aligned} $$
(230)

11 Transforming a Network Reference System into an (Approximate) Earth Reference System

The solution to the stacking problem defined by kinematic constraints is by far the optimal choice, at least from a purely geodetic point of view, because it does not exhibit temporal coordinate variations that reflect not actual variations of the network shape but merely variations of the chosen spatiotemporal reference system. The other desired property, that of closeness to a pre-existing solution, e.g., one in an officially adopted reference frame, can also be satisfied by choosing the initial epoch reference system as desired, through the choice of values for the arbitrary constants and of the approximate values for the initial epoch orientation constraint borrowed from the inner constraints. Thus optimality refers only to the temporal evolution (rate) of the reference system and in particular to the velocity estimates, which are the smallest possible in the sense of their mean quadratic modulus. The only other velocity estimates worth using are those closest to a given geophysical model, for the sole purpose of investigating its compatibility with the available data. From the geodetic point of view, the kinematic constraints solution is the best basis for global, regional and even local mapping.

Nevertheless, even such an optimal solution has its drawbacks, which are related to the discrete nature of geodetic networks, the dependence on the particular design, and of course, most of all, the lack of coverage over the oceans. A truly optimal reference system in the sense of Tisserand should refer to the temporal variation of the continuous mass distribution within the whole earth. Such knowledge is beyond the reach of geodetic observational techniques, but a reasonable compromise is to restrict attention to the lithosphere of the earth. The basic idea is to use the knowledge of the variation in the geometric configuration of networks on the earth surface in order to infer corresponding variations within the lithosphere and in particular the relative motion of plates and sub-plates. In the following, we will use the term plate in a more general sense that covers not only sub-plates but also any region that exhibits kinematic behavior different from that of its neighboring ones.

Starting with an existing spatiotemporal coordinate system, described by the coordinate functions xi(t) of a global geodetic network, we will convert it into an optimal one \(\tilde {\mathbf {x}}_i (t)\) as follows: First an optimal discrete Tisserand reference system will be separately established for each subnetwork DK covering a corresponding plate PK. Next the derived motion of each of these local reference systems will be assumed to correspond to the motion of all points within the corresponding plate, thus allowing us to compute the angular momentum of the lithosphere as the sum of those of the plates. Finally, the rigid transformation will be sought which leads to a Tisserand reference system for the lithosphere, in which the total angular momentum of the lithosphere vanishes, as required.

In doing so, we must first determine how the angular momentum varies under a time-dependent rigid transformation. The angular momentum in the original reference system is given by
$$\displaystyle \begin{aligned} \mathbf{h} = \int {[(\mathbf{x} - \mathbf{m})\times ]} (\dot{\mathbf{x}} - \dot{\mathbf{m}})dm, \end{aligned} $$
(231)
where \(M = \int {dm}\) is the mass and \(\mathbf {m} = \frac {1}{M}\int {\mathbf {x}\,dm} \) the barycenter of the body in question. A rigid transformation \(\tilde {\mathbf {x}} = \mathbf {Rx} + \mathbf {d}\) will result in \(\tilde {\mathbf {m}} = \mathbf {Rm} + \mathbf {d}\), \(\dot {\tilde {\mathbf {x}}} = \dot {\mathbf {R}}\mathbf {x} + \mathbf {R}\dot {\mathbf {x}} + \dot {\mathbf {d}}\) and \(\dot {\tilde {\mathbf {m}}} = \dot {\mathbf {R}}\mathbf {m} + \mathbf {R}\dot {\mathbf {m}} + \dot {\mathbf {d}}\), so that \(\tilde {\mathbf {x}} - \tilde {\mathbf {m}} = \mathbf {R}(\mathbf {x} - \mathbf {m})\) and \(\dot {\tilde {\mathbf {x}}} - \dot {\tilde {\mathbf {m}}} =\)\(\dot {\mathbf {R}}(\mathbf {x} - \mathbf {m}) + \mathbf {R}(\dot {\mathbf {x}} - \dot {\mathbf {m}})\). The corresponding rotation vector is given by \([\tilde {\boldsymbol {\upomega }}\times ] = \mathbf {R}\dot {\mathbf {R}}^T\) in the new system and by \(\boldsymbol {\upomega } = {\mathbf {R}}^T\tilde {\boldsymbol {\upomega }}\) in the original system, so that
$$\displaystyle \begin{aligned}{}[\boldsymbol{\upomega }\times ] = \dot{\mathbf{R}}^T\mathbf{R} = - {\mathbf{R}}^T\dot{\mathbf{R}}, \end{aligned} $$
(232)
with the last term following from the differentiation of RTR = I. Therefore
$$\displaystyle \begin{aligned} \dot{\mathbf{R}} = - \mathbf{R}[\boldsymbol{\upomega }\times ],\qquad \dot{\tilde{\mathbf{x}}} - \dot{\tilde{\mathbf{m}}} = \mathbf{R}[(\mathbf{x} - \mathbf{m})\times ]\boldsymbol{\upomega } + \mathbf{R}(\dot{\mathbf{x}} - \dot{\mathbf{m}}). \end{aligned} $$
(233)
Replacing into the expression for the angular momentum in the new system we get
$$\displaystyle \begin{aligned} \tilde{\mathbf{h}} &= \int {[(\tilde{\mathbf{x}} - \tilde{\mathbf{m}})\times ]} (\dot{\tilde{\mathbf{x}}} - \dot{\tilde{\mathbf{m}}})dm = \\ &= \int {\mathbf{R}[(\mathbf{x} - \mathbf{m})\times ]{\mathbf{R}}^T\{\mathbf{R}[(\mathbf{x} - \mathbf{m})\times ]\boldsymbol{\upomega }} {+ \mathbf{R}(\dot{\mathbf{x}} - \dot{\mathbf{m}})\}} dm =\\ & = \mathbf{R}\int {[(\mathbf{x} - \mathbf{m})\times ]}^2dm\boldsymbol{\upomega} + \mathbf{R}\int {[(\mathbf{x} - \mathbf{m})\times ]} (\dot{\mathbf{x}} - \dot{\mathbf{m}})dm. \end{aligned} $$
(234)
Recognizing the original angular momentum in the second term and
$$\displaystyle \begin{aligned} \mathbf{C} = - \int {[(\mathbf{x} - \mathbf{m})\times ]^2dm,}\end{aligned} $$
(235)
as the inertia matrix of the body in the original reference system, we obtain the law of transformation of angular momentum
$$\displaystyle \begin{aligned} \tilde{\mathbf{h}} = \mathbf{R}(\mathbf{h} - \mathbf{C}\boldsymbol{\upomega}). \end{aligned} $$
(236)
The same law holds true for the discrete angular momentum of geodetic networks, as can easily be seen by replacing integration with summation in the above derivation.
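As a concrete illustration of the discrete version, the following short numerical sketch (Python with numpy/scipy; stations, velocities and the rotation are random placeholders with equal unit masses, so it is a check of the algebra only) builds both sides of the law \(\tilde{\mathbf{h}} = \mathbf{R}(\mathbf{h} - \mathbf{C}\boldsymbol{\upomega})\) by direct summation and confirms their agreement:

```python
# Numerical check (illustrative only) of the discrete counterpart of the
# transformation law (236), with integrals replaced by sums over stations
# of equal unit mass.
import numpy as np
from scipy.linalg import expm

def skew(a):
    """Matrix [a x] with skew(a) @ b = a x b."""
    return np.array([[0., -a[2], a[1]],
                     [a[2], 0., -a[0]],
                     [-a[1], a[0], 0.]])

rng = np.random.default_rng(1)
x = rng.standard_normal((10, 3))            # station positions
v = rng.standard_normal((10, 3))            # station velocities
w = np.array([1e-3, -2e-3, 5e-4])           # rotation vector, original system
R = expm(skew(rng.standard_normal(3)))      # an arbitrary rotation matrix
d, d_dot = rng.standard_normal(3), rng.standard_normal(3)

m, m_dot = x.mean(axis=0), v.mean(axis=0)   # discrete barycenter and its rate
h = sum(skew(xi - m) @ (vi - m_dot) for xi, vi in zip(x, v))   # eq. (231)
C = -sum(skew(xi - m) @ skew(xi - m) for xi in x)              # eq. (235)

# transformed quantities: x~ = R x + d, with R_dot = -R [w x] from eq. (233)
x_t = x @ R.T + d
v_t = x @ (-R @ skew(w)).T + v @ R.T + d_dot
m_t, m_t_dot = x_t.mean(axis=0), v_t.mean(axis=0)
h_t = sum(skew(xi - m_t) @ (vi - m_t_dot) for xi, vi in zip(x_t, v_t))

assert np.allclose(h_t, R @ (h - C @ w))    # the law (236) holds exactly
```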
We will next seek the rigid transformation \(\mathbf{{x}^{\prime}}_i = \mathbf{{R}^{\prime}}{\mathbf{x}}_i + \mathbf{{d}^{\prime}}\) from the original reference system to the one optimal for a particular sub-network DK. Denoting by
$$\displaystyle \begin{gathered} {} {\mathbf{h}}_{D_K } = \sum_{i \in D_K } {[({\mathbf{x}}_i - \bar{\mathbf{x}}_B )\times ](\dot{\mathbf{x}}_i - \dot{\bar{\mathbf{x}}}_B ),} \end{gathered} $$
(237)
$$\displaystyle \begin{gathered} {} {\mathbf{C}}_{D_K } = - \sum_{i \in D_K } {[({\mathbf{x}}_i - \bar{\mathbf{x}}_B )\times ]^2,} \end{gathered} $$
(238)
$$\displaystyle \begin{gathered} {} [\boldsymbol{\upomega}_K \times ] = {\dot{\mathbf{R}}}^{\prime T}\mathbf{{R}^{\prime}} = - \mathbf{{R}}^{\prime T}{\dot{\mathbf{R}}^{\prime}}, \end{gathered} $$
(239)
the angular momentum, the inertia matrix, and the rotation vector, respectively, of the subnetwork in the original reference system, application of the above transformation law gives \(\mathbf {{h}^{\prime }}_{D_K } = \mathbf {{R}^{\prime }}({\mathbf {h}}_{D_K } - {\mathbf {C}}_{D_K } \boldsymbol {\upomega }_K )\). For optimality with respect to orientation it must hold that \(\mathbf {{h}^{\prime }}_{D_K } = \mathbf {0}\), in which case the rotation vector can be computed from
$$\displaystyle \begin{aligned} \boldsymbol{\upomega}_K = {\mathbf{C}}_{D_K }^{ - 1} {\mathbf{h}}_{D_K } .\end{aligned} $$
(240)
Optimality with respect to the origin is established from
$$\displaystyle \begin{aligned} {\dot{\mathbf{m}}^{\prime}} = {\dot{\mathbf{R}}^{\prime}\mathbf{m}} + \mathbf{{R}}^{\prime}\dot{\mathbf{m}} + {\dot{\mathbf{d}}^{\prime}} = \mathbf{{R}^{\prime}}[\mathbf{m}\,\times ]\boldsymbol{\upomega}_K + \mathbf{{R}}^{\prime}\dot{\mathbf{m}} + {\dot{\mathbf{d}}^{\prime}} = \mathbf{0}. \end{aligned} $$
(241)
The optimal rigid transformation elements \(\mathbf{{R}^{\prime}}\) and \(\mathbf{{d}^{\prime}}\) can be determined by solving the differential equation \({\dot {\mathbf {R}}^{\prime }} = - \mathbf {{R}^{\prime }}[\boldsymbol {\upomega }_K \times ]\) for the parameters θ of \(\mathbf{{R}^{\prime}}(\boldsymbol{\uptheta})\) and then solving the differential equation \({\dot {\mathbf {d}}^{\prime }} = - \mathbf {{R}^{\prime }}([\mathbf {m}\times ]\boldsymbol {\upomega }_K + \dot {\mathbf {m}})\) for \(\mathbf{{d}^{\prime}}\). Fortunately, we will not need to do this.
At this point we assume that the plate PK moves rigidly in the way described by the reference system best fitted to its subnetwork DK, which means that the plate points do not move with respect to this system, so that \({\dot {\mathbf {x}}^{\prime }} = \mathbf {0}\) for all points in PK. This presupposes that the residual apparent velocities \(\mathbf{{v}^{\prime}}_i\) are indeed small and can be attributed either to the effect of observational errors or to within-plate deformations of secondary importance. From \({\dot {\mathbf {x}}^{\prime }} = \mathbf {0}\) it follows that the angular momentum also vanishes in this reference system, i.e., that \(\mathbf {{h}^{\prime }}_{P_K } = \mathbf {0}\). Since by the law of angular momentum transformation \(\mathbf {{h}^{\prime }}_{P_K } = \mathbf {{R}^{\prime }}({\mathbf {h}}_{P_K } - {\mathbf {C}}_{P_K } \boldsymbol {\upomega }_K ) = \mathbf {0}\), it follows that \({\mathbf {h}}_{P_K } = {\mathbf {C}}_{P_K } \boldsymbol {\upomega }_K \). In order to determine the rigid transformation \(\tilde {\mathbf {x}} = \mathbf {Rx} + \mathbf {d}\) which transforms the original system to an optimal one, we note that by the law of transformation \(\tilde {\mathbf {h}}_{P_K } = \mathbf {R} ({\mathbf {h}}_{P_K } - {\mathbf {C}}_{P_K } \boldsymbol {\upomega })\), with ω now being the rotation vector of the rigid transformation expressed in the original reference system. Optimality with respect to orientation is achieved by setting the total angular momentum of the lithosphere equal to zero
$$\displaystyle \begin{aligned} \tilde{\mathbf{h}}_{\cup_{K} P_K } = \sum_K \ {\tilde{\mathbf{h}}_{P_K } } = \sum_K\mathbf{R} ({\mathbf{h}}_{P_K} - {\mathbf{C}}_{P_K} \boldsymbol{\upomega}) = \mathbf{R}\sum_K \ {({\mathbf{C}}_{P_K} \boldsymbol{\upomega}_K - {\mathbf{C}}_{P_K } \boldsymbol{\upomega }) = \mathbf{0}.} \end{aligned} $$
(242)
Solving the above equation for ω it follows that
$$\displaystyle \begin{aligned} \boldsymbol{\upomega } = \left( {\sum_K {{\mathbf{C}}_{P_K } } } \right)^{ - 1}\sum_K {{\mathbf{C}}_{P_K } \boldsymbol{\upomega}_K } = \left( {\sum_K {{\mathbf{C}}_{P_K } } } \right)^{ - 1}\sum_K {{\mathbf{C}}_{P_K } } {\mathbf{C}}_{D_K }^{ - 1} {\mathbf{h}}_{D_K ,} \end{aligned} $$
(243)
and the rotation vector ω is in fact a weighted mean of the rotation vectors ωK of the reference systems that are best fitted to the subnetworks DK. For optimality with respect to the origin we require that \(\dot {\tilde {\mathbf {m}}} = \mathbf {0}\). Obviously \(\tilde {\mathbf {m}} = \mathbf {Rm} + \mathbf {d}\), where \(\mathbf {m} = \frac {1}{M}\sum \limits _K {M_K {\mathbf {m}}_K}\) is the barycenter of the lithosphere, with \({\mathbf {m}}_K = \frac {1}{M_K}\int \nolimits _{P_K} {\mathbf {x}\,dm} \) and \(M_K = \int \nolimits _{P_K} {dm} \) being the barycenter and mass of plate PK, respectively, and \(M = \sum \limits _K M_K\) the total mass. Consequently \(\dot {\tilde {\mathbf {m}}} = \dot {\mathbf {R}}\mathbf {m} + \mathbf {R}\dot {\mathbf {m}} + \dot {\mathbf {d}} = \mathbf {R}[\mathbf {m}\,\times ]\boldsymbol {\upomega } + \mathbf {R}\dot {\mathbf {m}} + \dot {\mathbf {d}} = \mathbf {0}\) and the elements R, d of the rigid transformation leading to the optimal reference system can be determined from one of the solutions of the differential equations
$$\displaystyle \begin{aligned} \dot{\mathbf{R}} = - \mathbf{R}[\boldsymbol{\upomega }\times ],\qquad \dot{\mathbf{d}} = - \mathbf{R}([\mathbf{m}\times ]\boldsymbol{\upomega } + \dot{\mathbf{m}}). \end{aligned} $$
(244)
If we consider a representation in terms of rotation angles around the axes, R(θ) = R3(θ3)R2(θ2)R1(θ1), and set \([\boldsymbol {\upomega }_j \times ] = - {\mathbf {R}}^T\frac {\partial \mathbf {R}}{\partial \theta _j}\), it follows that \(\boldsymbol {\upomega } = \dot {\theta }_1 \boldsymbol {\upomega }_1 + \dot {\theta }_2 \boldsymbol {\upomega }_2 +\dot {\theta }_3 \boldsymbol {\upomega }_3 = \boldsymbol {\Omega } \boldsymbol {\dot {\uptheta }}\), where Ω = [ω1ω2ω3], and we must first solve the differential equation
$$\displaystyle \begin{aligned} \boldsymbol{\dot{\uptheta }} = {\boldsymbol{\Omega} }^{ - 1}\boldsymbol{\upomega }, \end{aligned} $$
(245)
e.g., by numerical integration, and then use the solution θ in solving next \(\dot {\mathbf {d}} = - \mathbf {R}(\boldsymbol {\uptheta })([\mathbf {m}\,\times ]\,\boldsymbol {\upomega } + \dot {\mathbf {m}})\). For the usual representation R(θ) = R3(θ3)R2(θ2)R1(θ1) with rotations around the axes, the desired matrix becomes
$$\displaystyle \begin{aligned} {\boldsymbol{\Omega}}^{ - 1} &= \left[ \begin{array}{ccc} 1 & { - \sin \,\theta_1 } & {\sin \,\theta_2 } \\ 0 & {\cos \,\theta_1 } & { - \sin \theta_1 \cos \theta_2 } \\ 0 & 0 & {\cos \theta_1 \cos \theta_2 } \end{array} \right]^{ - 1} = \left[ \begin{array}{ccc} 1 & {\tan \,\theta_1 } & {\tan^2\,\theta_1 - \frac{\tan \theta_2 }{\cos \theta_1 }} \\ 0 & {\frac{1}{\cos \,\theta_1 }} & {\frac{\tan \theta_1 }{\cos \theta_1 }} \\ 0 & 0 & {\frac{1}{\cos \theta_1 \cos \theta_2 }} \end{array} \right]\approx\\ &\approx \left[ \begin{array}{ccc} 1 & {\theta_1 } & { - \theta_2 } \\ 0 & 1 & {\theta_1 } \\ 0 & 0 & 1 \end{array} \right], \end{aligned} $$
(246)
the last term being an acceptable approximation, since we are concerned here with transformations close to the identity and we can neglect second order terms in the small rotation angles θ1, θ2, θ3. With this value the differential equation splits into two parts
$$\displaystyle \begin{aligned} \dot{\theta }_3 = \omega_3 ,\quad \left[ \begin{array}{c} {\dot{\theta }_1 } \\ {\dot{\theta }_2 } \end{array} \right] = \left[ \begin{array}{cc} {\omega_2 } & { - \omega_3 } \\ {\omega_3 } & 0 \end{array} \right]\left[ \begin{array}{c} {\theta_1 } \\ {\theta_2 } \end{array} \right] + \left[ \begin{array}{c} {\omega_1 } \\ {\omega_2 } \end{array} \right]. \end{aligned} $$
(247)
The first has the solution θ3 = θ3,0 + (t − t0)ω3, while the second is a system of first order differential equations \(\boldsymbol {\dot {\uptheta }} = \mathbf {A\boldsymbol {\uptheta } } + \mathbf {b}\) with constant coefficients, which can be solved by standard mathematical techniques.
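As an illustration, the following sketch (Python; the rotation-vector components are arbitrary placeholder values, and ω is assumed constant as stated above) evaluates the closed-form solution for θ3 and integrates the two-dimensional system numerically:

```python
# Sketch (assumed constant rotation vector; arbitrary values) of the solution
# of the split system (247): theta3 in closed form, and the constant-
# coefficient linear system for (theta1, theta2) by numerical integration.
import numpy as np
from scipy.integrate import solve_ivp

w1, w2, w3 = 1e-2, -2e-2, 3e-2
t0, t1 = 0.0, 10.0
theta0 = np.zeros(3)                      # initial angles theta(t0)

A = np.array([[w2, -w3],
              [w3, 0.0]])
b = np.array([w1, w2])

sol = solve_ivp(lambda t, th: A @ th + b, (t0, t1), theta0[:2],
                rtol=1e-10, atol=1e-12)
theta1, theta2 = sol.y[:, -1]             # theta1(t1), theta2(t1)
theta3 = theta0[2] + (t1 - t0) * w3       # closed-form solution for theta3
```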
The above approach is general enough, as it considers any possible motion of each plate. However, plates are primarily known to “float” over the mantle, without “bending” in the sense of rotating around a horizontal axis passing through their barycenter. The only other possible motion is a vertical one, associated with post-glacial uplift. It might therefore seem a good idea to consider only the horizontal components of the subnetwork point velocities in our analysis and to employ only horizontal angular momenta. However, this is of no concern, since vertical velocity components do not contribute to the angular momentum. Indeed, if we analyze each barycentric velocity \({\mathbf {v}}_i - \dot {\mathbf {m}} = \dot {\mathbf {x}}_i - \dot {\mathbf {m}}\) into a horizontal and a radial-vertical part as \({\mathbf {v}}_i - \dot {\mathbf {m}} = {\mathbf {v}}_i^H + {\mathbf {v}}_i^V \), where \({\mathbf {v}}_i^V || {\mathbf {x}}_i - \mathbf {m}\) and \({\mathbf {v}}_i^H \bot {\mathbf {x}}_i - \mathbf {m}\), then, since \([({\mathbf {x}}_i - \mathbf {m})\times ]\,{\mathbf {v}}_i^V = \mathbf {0}\), it follows that the angular momentum
$$\displaystyle \begin{aligned} \mathbf{h} = \int {[(\mathbf{x} - \mathbf{m})\times ]} (\dot{\mathbf{x}} - \dot{\mathbf{m}})dm = \int {[(\mathbf{x} - \mathbf{m})\times ]} {\mathbf{v}}^H dm, \end{aligned} $$
(248)
involves only horizontal velocities, and we do not need to worry about vertical motions, which are additionally of lower accuracy than the horizontal ones, at least for GPS/GNSS networks.

A first simplification is to consider only “floating” plate motions, described by a time varying rotation R(t) (\(\tilde {\mathbf {x}}_i = \mathbf {R}{\mathbf{x}}_i\), without translation) around a time varying rotation vector ω(t), i.e., rotation about a migrating axis with varying angular velocity. Repeating the above procedure without the translation part, we arrive at exactly the same Eq. (243), and we need only solve the differential equation \(\boldsymbol {\dot {\uptheta }} = {\boldsymbol {\Omega } }(\boldsymbol {\uptheta })^{ - 1}\boldsymbol {\upomega }\) to determine θ(t) and apply the transformation \(\tilde {\mathbf {x}}_i (t) =\)R(θ(t))xi(t) in order to pass to the optimal lithosphere-related reference system.

The next step towards simplification is to assume that plates rotate around a fixed axis with constant angular velocity, i.e., with time independent rotation vectors ωK. Then the rotation induced “horizontal” velocities vω,i = [ωK×]xi = −[xi×]ωK should be fitted in the least squares sense to the horizontal components \({\mathbf {v}}_i^H \) of the observed constant velocities vi, within each sub-network DK. We seek to minimize
$$\displaystyle \begin{aligned} \phi &= \sum_{i \in D_k } {({\mathbf{v}}_{i,\omega } - {\mathbf{v}}_i^H )}^T({\mathbf{v}}_{i,\omega } - {\mathbf{v}}_i^H ) =\\ & = \sum_{i \in D_k } {( - [{\mathbf{x}}_i \times ]\boldsymbol{\upomega}_K - {\mathbf{v}}_i^H )^T} ( - [{\mathbf{x}}_i \times ]\boldsymbol{\upomega}_K - {\mathbf{v}}_i^H ) = \mathop{\min }\limits_{\boldsymbol{\upomega}_K } . \end{aligned} $$
(249)
It can be easily shown that
$$\displaystyle \begin{aligned} \phi & = - \boldsymbol{\upomega}_K^T \sum_{i \in D_k } {[{\mathbf{x}}_i \times ]^2\boldsymbol{\upomega}_K - 2\boldsymbol{\upomega}_K^T \sum_{i \in D_k } {[{\mathbf{x}}_i \times ]} {\mathbf{v}}_i^H + \sum_{i \in D_k } ({\mathbf{v}}_i^H )}^T{\mathbf{v}}_i^H=\\ &= \boldsymbol{\upomega}_K^T {\mathbf{C}}_{D_K } \boldsymbol{\upomega}_K - 2\boldsymbol{\upomega}_K^T {\mathbf{h}}_{D_k }^H + \sum_{i \in D_k } ({\mathbf{v}}_i^H )^T{\mathbf{v}}_i^H , \end{aligned} $$
(250)
where \({\mathbf {h}}_{D_k }^H = \sum \limits _{i \in D_k } {[{\mathbf {x}}_i \times ]} {\mathbf {v}}_i^H = \sum \limits _{i \in D_k } {[{\mathbf {x}}_i \times ]} {\mathbf {v}}_i \) is the non-barycentric horizontal angular momentum, and the minimum is obtained from \(\frac {\partial \phi }{\partial \boldsymbol {\upomega }_K } = \mathbf {0}\), which gives
$$\displaystyle \begin{aligned} \boldsymbol{\upomega}_K = {\mathbf{C}}_{D_K }^{ - 1} {\mathbf{h}}_{D_K }^H . \end{aligned} $$
(251)
The only difference from the general approach is that the barycentric angular momentum \({\mathbf {h}}_{D_k}\) has been replaced by the non-barycentric angular momentum \({\mathbf {h}}_{D_k }^H\). For the standard linear-in-time model xi(t) = x0i + (t − t0)vi the non-barycentric angular momentum becomes \({\mathbf {h}}_{D_k}^H = \sum \limits _{i \in D_k } {[{\mathbf {x}}_{0i} \times ]{\mathbf {v}}_i } \), while \({\mathbf {h}}_{D_K } = \sum \limits _{i \in D_k } {[{\mathbf {x}}_{0i} \times ]{\mathbf {v}}_i - n[\bar {\mathbf {x}}_0 \times ]} \bar {\mathbf {v}}\).
In this particular case, where plate rotation around fixed axes with constant angular velocities has been assumed, the constant plate rotation vectors ωK result in a constant rotation vector ω for the transformation from the original to the optimal reference system, provided the small variations in the inertia matrices of the plates are disregarded. The transformation will therefore cause each point to move on a circle in a plane perpendicular to ω. For small time intervals the transformation \(\tilde {\mathbf {x}}_i = \mathbf {Rx}_i \) can be approximated simply by \(\tilde {\mathbf {x}}_i = {\mathbf {x}}_i + (t - t_0 )[\boldsymbol {\upomega }\times ]{\mathbf {x}}_i \). For the linear-in-time model xi = x0 i + (t − t0)vi this results in the transformations
$$\displaystyle \begin{aligned} \tilde{\mathbf{x}}_{0i} = {\mathbf{x}}_{0i} ,\qquad \tilde{\mathbf{v}}_i = {\mathbf{v}}_i + [\boldsymbol{\upomega }\times ]{\mathbf{x}}_{0\,i} + (t - t_0 )[\boldsymbol{\upomega }\times ]{\mathbf{v}}_i \approx {\mathbf{v}}_i + [\boldsymbol{\upomega }\times ]{\mathbf{x}}_{0\,i}^{ap} = {\mathbf{v}}_i - [{\mathbf{x}}_{0\,i}^{ap} \times ]\boldsymbol{\upomega }, \end{aligned} $$
(252)
neglecting second order terms as usual. We see that with this simplified but nevertheless reasonable plate rotation model, only the velocities of a global reference frame need to be modified.
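The whole simplified scheme is compact enough to sketch numerically. The following Python fragment is illustrative only: the coordinates and velocities are random placeholders, and, for lack of mass information, the plate inertia matrices \({\mathbf{C}}_{P_K}\) of Eq. (243) are approximated here by the subnetwork matrices \({\mathbf{C}}_{D_K}\):

```python
# Illustrative sketch (random placeholder data) of the simplified scheme:
# per-plate rotation vectors from eq. (251), their weighted mean from
# eq. (243), and the velocity correction of eq. (252).
import numpy as np

def skew(a):
    """Matrix [a x] with skew(a) @ b = a x b."""
    return np.array([[0., -a[2], a[1]],
                     [a[2], 0., -a[0]],
                     [-a[1], a[0], 0.]])

def plate_rotation(x0, v):
    """w_K = C_DK^{-1} h_DK^H for one subnetwork; x0, v are (n, 3) arrays."""
    C = -sum(skew(xi) @ skew(xi) for xi in x0)            # inertia matrix
    hH = sum(np.cross(xi, vi) for xi, vi in zip(x0, v))   # sum of [x0i x] vi
    return C, np.linalg.solve(C, hH)

# per-plate initial coordinates (m) and velocities (m/yr); random stand-ins
rng = np.random.default_rng(2)
plates = [(rng.standard_normal((8, 3)) * 6.4e6,
           rng.standard_normal((8, 3)) * 0.01) for _ in range(3)]

Cs, ws = zip(*(plate_rotation(x0, v) for x0, v in plates))
# weighted mean rotation vector, eq. (243), with C_PK approximated by C_DK
w = np.linalg.solve(sum(Cs), sum(C @ wK for C, wK in zip(Cs, ws)))

# corrected frame velocities, eq. (252): v~ = v + [w x] x0
v_tilde = [v + np.cross(w, x0) for x0, v in plates]
```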

The transformation to an optimal reference system that best fits the lithosphere, rather than a particular global geodetic network, faces two difficulties in its realization. The first is the lack of a dense high quality geodetic network over the land areas, but the emergence of continuously improving GPS/GNSS networks will eventually resolve this problem. The second and more serious one is the lack of coverage over the oceans, where information is limited to isolated stations on islands. Inevitably, one has to rely in this case on geophysical models for the identification and rotation of plates and sub-plates. In any case, conversion to a reference system best fitted to the lithosphere continuum is important, not only for liberating the choice from a particular discrete geodetic network configuration, but also for comparison with earth rotation theory. Earth orientation parameters determined by geodetic space techniques refer, or are converted, to the adopted reference system, while theories provide earth orientation parameters referring primarily to the lithosphere, or to the mantle to be more precise, because they incorporate different rotations for the earth’s mantle and core. The basic ideas of the approach presented here were developed by Dermanis [26].

12 Formulation of the International Terrestrial Reference Frame: Introductory Remarks

Data from four geodetic space techniques are jointly analyzed in order to obtain estimates of parameters a, describing the temporal variation of the coordinates xi = xi(t, a) of selected stations i of a global geodetic network. The estimates \(\hat {\mathbf {a}}\) of the coordinate model parameters a, which depend on the choice of a spatiotemporal reference system in the final combined solution for all space techniques, comprise the International Terrestrial Reference Frame (ITRF), which is a practical realization of the International Terrestrial Reference System (ITRS), see [1, 3, 4, 5, 6, 7, 8, 9, 58, 60]. The official ITRF is the responsibility of the IERS (International Earth Rotation and Reference Systems Service), which is a joint service of the IAG (International Association of Geodesy) and the IAU (International Astronomical Union). The dominant ITRF parameters (and until recently the only ones) are the initial epoch coordinates x0i and the constant velocities vi of the linear-in-time model xi(t) = x0i + (t − t0)vi. Only the latest version, ITRF-2014 [7], includes additional parameters for non-linear annual and semi-annual periodic variations, as well as models for post-seismic variation.

Four techniques are involved in the ITRF formulation: Very Long Baseline Interferometry (VLBI), Satellite Laser Ranging (SLR), Doppler Orbitography and Radiopositioning Integrated by Satellite (DORIS) and GPS (Global Positioning System). We will use the term GPS here in a wider sense that also covers satellite positioning systems other than the U.S. “GPS” brand name, namely the Russian GLONASS, the European GALILEO, the Chinese BeiDou and other existing or emerging regional systems. We prefer the term GPS partly for its simplicity and partly because the officially adopted term Global Navigation Satellite System (GNSS) does not pay proper tribute to the efforts of the geodetic community, which are directed towards very high accuracy positioning and have little to do with low accuracy “navigation”. All four techniques provide positioning information in terms of single epoch solutions, with different characteristics and qualities in their reference system definition. They also provide different information additional to that of positioning. SLR is the only technique that provides information relating to the gravitational field of the earth. The only part of this information that relates to the ITRF consists of the first order spherical harmonic coefficients \(C_{10} = \mu x_G^3 \), \(C_{11} = \mu x_G^1 \), \(S_{11} = \mu x_G^2 \), which are related to the coordinates \(x_G^1 \), \(x_G^2 \), \(x_G^3 \) of the geocenter (center of mass of the earth). The parameter μ = GM is the product of the universal gravitational constant G and the mass M of the earth. VLBI is the only technique that provides EOPs related to precession-nutation, since it observes extragalactic radio sources with fixed directions with respect to the International Celestial Reference System (ICRS), which is realized by the International Celestial Reference Frame (ICRF), a catalogue of the direction angles of selected radio sources. VLBI mainly senses the geometric relation between the ITRS and the ICRS, as realized by the three parameters (rotation angles) of the instantaneous rotation matrix. The separation into precession-nutation on the one hand, and polar motion and length-of-day (equivalent to the angular velocity of rotation) on the other, is realized by modelling, which forms the basis for EOP estimates that are claimed to relate to the CIP and not to the instantaneous rotation axis. The other techniques track satellites, with orbits evolving according to the laws of dynamics as described in an inertial or quasi-inertial reference system such as the ICRS. Description of orbits with respect to the ITRS introduces the pseudo-forces related to the instantaneous earth rotation vector (see Sect. 2). Thus polar motion and length-of-day EOPs are produced, which are also claimed to relate to the CIP rather than the instantaneous earth rotation vector, which appears in the pseudo-force relations (Eq. 13). Strictly speaking, the earth rotation vector is a type of velocity associated with the time-continuous rotation matrix (recall the generalized Euler kinematic equations \([\boldsymbol {\upomega }\times ] = \mathbf {R}\,\dot {\mathbf {R}}^T)\), while the available data are discrete in time. Therefore, some type of interpolation with respect to time is required to obtain EOP estimates. Usually a linear-in-time model is assumed for EOPs, either within the small time interval of the observations, which provides a “single epoch” solution, or for daily subintervals.
Thus the obtained EOPs are some type of average over a small time span, and higher frequency components of earth rotation remain undetectable. In any case, the question of what type of EOPs are obtained from the space techniques requires further theoretical investigation, especially since higher temporal resolution data make possible the sub-daily determination of earth rotation, eventually leading to the abandonment of the CIP in favor of a rotation pole closer to the instantaneous earth rotation axis (see e.g., [9, 10, 17]).

All space techniques have their own scale as defined by their own set of atomic clocks. Nevertheless, scale information from VLBI and SLR is considered to be of much higher quality and the ITRF initial scale and its rate are weighted combinations of those from VLBI and SLR.

Before proceeding with the data combination, one must have a clear view of the actual deficiencies and weaknesses of the data with respect to their adopted reference system, for each technique and each single epoch solution. The three indices ωQ, ψQ and \(\chi _{Q,\min } / \chi _{Q,\max } \), already mentioned (Sect. 7), developed by Chatzinikos and Dermanis [21], can be used for this purpose, in order not to rely on theoretical considerations only. Their application requires knowledge of the normal equation coefficient matrices, which are also necessary for the final combined all-techniques solution. Only VLBI provides the normal equations for its solutions. GPS and DORIS provide covariance matrices obtained with the use of minimal constraints, which allow the recovery of the normal equation matrices. Taking advantage of the conversion of a minimally constrained solution \(\hat {\mathbf {x}}_C \) to one with inner constraints
$$\displaystyle \begin{aligned} \hat{\mathbf{x}}_E = [\mathbf{I} - \mathbf{E}({\mathbf{E}}^T\mathbf{E})^{ - 1}{\mathbf{E}}^T]\hat{\mathbf{x}}_C \equiv \mathbf{H}\hat{\mathbf{x}}_C , \end{aligned} $$
(253)
we can convert the covariance factor matrices to
$$\displaystyle \begin{aligned} {\mathbf{Q}}_{\hat{\mathbf{x}}_E } = \mathbf{HQ}_{\hat{\mathbf{x}}_C } {\mathbf{H}}^T, \end{aligned} $$
(254)
and then take advantage of Eq. (91) to recover
$$\displaystyle \begin{aligned} \mathbf{N} = [{\mathbf{Q}}_{\hat{\mathbf{x}}_E } + \mathbf{E}({\mathbf{E}}^T\mathbf{E})^{- 2}{\mathbf{E}}^T]^{-1} - \mathbf{EE}^T. \end{aligned} $$
(255)

To obtain \({\mathbf {Q}}_{\hat {\mathbf {x}}_C } \) from the given covariance matrix estimate \(\hat {\mathbf {C}}_{\hat {\mathbf {x}}_C } = \hat {\sigma }^2{\mathbf {Q}}_{\hat {\mathbf {x}}_C } \) we only need to know the estimate \(\hat {\sigma }^2\). SLR provides covariance matrices obtained by loose constraints, which give a non-minimally constrained solution, with a covariance factor matrix of the general form \({\mathbf {Q}}_{\hat {\mathbf {x}}} = (\mathbf {N} + {\mathbf {P}}_x )^{ - 1}\). The usual choice is Px = k2I with k2 very small but sufficient to guarantee inversion; hence the term loose, although no actual constraint has been applied. The approach can be interpreted as a Bayesian one, with prior information on the parameters equivalent to the addition of pseudo-observations 0 = x + ex, with \({\mathbf {e}}_{\mathbf {x}} \sim (\mathbf {0}, \sigma ^2{\mathbf {P}}_{\mathbf {x}}^{ - 1} )\). Even in this case the normal equations matrix can be recovered as \(\mathbf {N} = {\mathbf {Q}}_{\hat {\mathbf {x}}}^{ - 1} - {\mathbf {P}}_x \) from \(\hat {\mathbf {C}}_{\hat {\mathbf {x}}} = \hat {\sigma }^2{\mathbf {Q}}_{\hat {\mathbf {x}}} \), if one knows \(\hat {\sigma }^2\) and Px (just k2 for the usual choice Px = k2I). The important difference is that normal equations and covariance matrices obtained by minimal constraints contain no information about the characteristics of the reference system that is not present in the available observations. On the contrary, covariance matrices of full rank, obtained in other ways, contain complete reference system information, and transformation parameters must be added as additional unknowns to accommodate this fact when such matrices are combined with reference-system-free data for a solution where the reference system is finally chosen by minimal constraints. Since the availability of existing information is not a scientific question, we will assume that normal equation matrices are available, commenting when necessary on problems caused by using the inverses of full rank covariance matrices as weight matrices.
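The recovery of the normal equations matrix can be verified numerically. In the following sketch (Python, with randomly generated placeholder matrices), a rank-deficient normal matrix N is built, the covariance factor matrix of a minimally constrained solution is formed, and N is recovered through Eqs. (253), (254) and (255):

```python
# Sketch (random placeholders) of recovering the singular normal matrix N
# from the covariance factor matrix of a minimally constrained solution,
# through the conversion (253)-(254) and the inversion formula (255).
import numpy as np

rng = np.random.default_rng(3)
m, k = 10, 3
E = rng.standard_normal((m, k))                       # null-space basis
P_E = E @ np.linalg.inv(E.T @ E) @ E.T                # projector onto range(E)
A = rng.standard_normal((m + 5, m)) @ (np.eye(m) - P_E)   # design with A E = 0
N = A.T @ A                                           # singular normal matrix

C = rng.standard_normal((m, k))                       # some minimal constraints
Naug_inv = np.linalg.inv(N + C @ C.T)
Q_C = Naug_inv @ N @ Naug_inv   # covariance factor of the constrained solution

H = np.eye(m) - P_E                                   # eq. (253)
Q_E = H @ Q_C @ H.T                                   # eq. (254)
EtE_inv = np.linalg.inv(E.T @ E)
N_rec = np.linalg.inv(Q_E + E @ EtE_inv @ EtE_inv @ E.T) - E @ E.T  # eq. (255)

assert np.allclose(N_rec, N)   # the original normal matrix is recovered
```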

As is well known, uncorrelated data sets can be jointly processed only if they partly depend on common parameters. Data from different space techniques share no common station-related parameters, in this case station initial coordinates and velocities, because they simply do not share common stations. Indeed, it is impossible for a VLBI antenna, a laser system, a DORIS beacon and a GPS antenna to be placed on exactly the same point, even if this were to happen at different observation instants. Instead, instruments of the different space techniques are placed near each other at the so-called collocation sites, and local high precision surveys provide the “ties”, i.e., the data that make possible the joint treatment of the data from the four space techniques. They have the form of the vector from the center of one instrument to that of another, or of its initial value and constant derivative, when a linear variation with time is detected. These vectors can be expressed as data bC = AVaV + ALaL + AGaG + ADaD + eC, depending on initial coordinates and velocities aV, aL, aG, aD of stations in the VLBI, SLR, GPS and DORIS networks, respectively. The per epoch estimates \(\hat {\mathbf {a}}_V \), \(\hat {\mathbf {a}}_L \), \(\hat {\mathbf {a}}_G \), \(\hat {\mathbf {a}}_D \) of coordinates and EOP time series are treated as pseudo-observations, depending on initial coordinates and velocities, as well as on per epoch transformation parameters, from the final (not yet defined) ITRF reference system to the reference system of each epoch within each technique. In this respect the ITRF formulation can be considered as a simultaneous stacking, realized through the data from ties at the collocation sites. The question is which matrices should be used as weight matrices for these pseudo-observations, especially when correctly computed covariance matrices using minimal constraints are singular and cannot be inverted to provide weight matrices as usual. For this reason, we will next examine the problem of jointly adjusting uncorrelated data sets in more detail.

For the sake of convenience we will introduce a special notation, used throughout this chapter, for block-diagonal (BD), block-column (BC) and block-row (BR) matrices involving a repetition of similar submatrices, each related to a particular space technique
$$\displaystyle \begin{aligned} \mathrm{BD}({\mathbf{M}}_T ) &\equiv \left[ \begin{array}{cccc} {{\mathbf{M}}_V } & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & {{\mathbf{M}}_S } & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & {{\mathbf{M}}_G } & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & {{\mathbf{M}}_D } \end{array} \right],\qquad \mathrm{BC}({\mathbf{M}}_T ) \equiv \left[ \begin{array}{c} {{\mathbf{M}}_V } \\ {{\mathbf{M}}_S } \\ {{\mathbf{M}}_G } \\ {{\mathbf{M}}_D } \end{array} \right],\\ \mathrm{BR}({\mathbf{M}}_T ) &\equiv \left[ \begin{array}{cccc} {{\mathbf{M}}_V } & {{\mathbf{M}}_S } & {{\mathbf{M}}_G } & {{\mathbf{M}}_D } \end{array} \right], \end{aligned} $$
(256)
where the indices stand for V = VLBI, S = SLR, G = GPS and D = DORIS. The notation will also naturally extend to the case of submatrices depending on some index k = 1, 2, …, K, i.e., BD(Mk), BC(Mk), BR(Mk).
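For reference, the three operators map directly onto standard block operations, e.g., in numpy (the 2 × 2 submatrices below are arbitrary placeholders):

```python
# The BD / BC / BR operators of (256) as standard block operations in numpy.
import numpy as np
from scipy.linalg import block_diag

MV, MS, MG, MD = (np.eye(2) * k for k in (1.0, 2.0, 3.0, 4.0))
BD = block_diag(MV, MS, MG, MD)      # BD(M_T): block-diagonal
BC = np.vstack([MV, MS, MG, MD])     # BC(M_T): block-column
BR = np.hstack([MV, MS, MG, MD])     # BR(M_T): block-row
```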

13 Basics of Data Set Combination

13.1 Combining Uncorrelated Data Sets

The ITRF formulation is based on uncorrelated data sets from different space techniques that all depend on the temporal variation of a global geodetic network and on additional parameters particular to each technique. These data can be combined at a higher level, where the primary data from all techniques are jointly analyzed at a prohibitive computational cost, or at a lower level, where already estimated coordinate and EOP time series are combined to produce initial coordinates, velocities and final EOPs in relation to a linear-in-time coordinate model. We will examine here, in a more general setup, the possible approaches to the combination of uncorrelated data sharing common parameters.

Consider the observation equations of two uncorrelated data sets
$$\displaystyle \begin{aligned} {\mathbf{b}}_1 = {\mathbf{A}}_{1x} \mathbf{x} + {\mathbf{A}}_{1y} \mathbf{y} + {\mathbf{e}}_1 ,\qquad {\mathbf{b}}_2 = {\mathbf{A}}_{2x} \mathbf{x} + {\mathbf{A}}_{2z} \mathbf{z} + {\mathbf{e}}_2 , \end{aligned} $$
(257)
where E{e1} = 0, \(E\{{\mathbf {e}}_1 {\mathbf {e}}_1^T\} = \sigma ^2{\mathbf {P}}_1^{ - 1} \), E{e2} = 0, \(E\{{\mathbf {e}}_2 {\mathbf {e}}_2^T\} = \sigma ^2{\mathbf {P}}_2^{ - 1} \) and \(E\{{\mathbf {e}}_1 {\mathbf {e}}_2^T\} = \mathbf {0}\), which are connected only through their common parameters x. The normal equations for the joint adjustment are
$$\displaystyle \begin{aligned} \left[ \begin{array}{ccc} {{\mathbf{N}}_{1,x} + {\mathbf{N}}_{2,x} } & {{\mathbf{N}}_{1,xy} } & {{\mathbf{N}}_{2,xz} } \\ {{\mathbf{N}}_{1,xy}^T } & {{\mathbf{N}}_{1,y} } & \mathbf{0} \\ {{\mathbf{N}}_{2,xz}^T } & \mathbf{0} & {{\mathbf{N}}_{2,z} } \end{array} \right]\left[ \begin{array}{c} \hat{\mathbf{x}} \\ \hat{\mathbf{y}} \\ \hat{\mathbf{z}} \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{u}}_{1x} + {\mathbf{u}}_{2x} } \\ {{\mathbf{u}}_{1y} } \\ {{\mathbf{u}}_{2z} } \end{array} \right], \end{aligned} $$
(258)
where \({\mathbf {N}}_{1,x} = {\mathbf {A}}_{1x}^T {\mathbf {P}}_1 {\mathbf {A}}_{1x} \), \({\mathbf {N}}_{1,y} = {\mathbf {A}}_{1y}^T {\mathbf {P}}_1 {\mathbf {A}}_{1y} \), \({\mathbf {N}}_{1,xy} = {\mathbf {A}}_{1x}^T {\mathbf {P}}_1 {\mathbf {A}}_{1y} \), \({\mathbf {u}}_{1x} = {\mathbf {A}}_{1x}^T {\mathbf {P}}_1 {\mathbf {b}}_1 \), \({\mathbf {u}}_{1y} = {\mathbf {A}}_{1y}^T {\mathbf {P}}_1 {\mathbf {b}}_1 \) and \({\mathbf {N}}_{2,x} = {\mathbf {A}}_{2x}^T {\mathbf {P}}_2 {\mathbf {A}}_{2x} \), \({\mathbf {N}}_{2,z} = {\mathbf {A}}_{2z}^T {\mathbf {P}}_2 {\mathbf {A}}_{2z} \), \({\mathbf {N}}_{2,xz} = {\mathbf {A}}_{2x}^T {\mathbf {P}}_2 {\mathbf {A}}_{2z} \), \({\mathbf {u}}_{2x} = {\mathbf {A}}_{2x}^T {\mathbf {P}}_2 {\mathbf {b}}_2 \), \({\mathbf {u}}_{2z} = {\mathbf {A}}_{2z}^T {\mathbf {P}}_2 {\mathbf {b}}_2 \). We are interested, in particular, in the case where the coefficient matrix of the above normal equations is singular, due to the lack of reference system definition, which must be introduced with the help of minimal constraints.
The normal equations for the joint adjustment can be viewed as the result of an addition
$$\displaystyle \begin{aligned} \left( {\left[ \begin{array}{ccc} {{\mathbf{N}}_{1,x} } & {{\mathbf{N}}_{1,xy} } & \mathbf{0} \\ {{\mathbf{N}}_{1,xy}^T } & {{\mathbf{N}}_{1,y} } & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} \end{array} \right] + \left[ \begin{array}{ccc} {{\mathbf{N}}_{2,x} } & \mathbf{0} & {{\mathbf{N}}_{2,xz} } \\ \mathbf{0} & \mathbf{0} & \mathbf{0} \\ {{\mathbf{N}}_{2,xz}^T } & \mathbf{0} & {{\mathbf{N}}_{2,z} } \end{array} \right]} \right)\left[ \begin{array}{c} \hat{\mathbf{x}} \\ \hat{\mathbf{y}} \\ \hat{\mathbf{z}} \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{u}}_{1x} } \\ {{\mathbf{u}}_{1y} } \\ \mathbf{0} \end{array} \right] + \left[ \begin{array}{c} {{\mathbf{u}}_{2x} } \\ \mathbf{0} \\ {{\mathbf{u}}_{2z} } \end{array} \right], \end{aligned} $$
(259)
after inflation (i.e., insertion of zero rows and columns for missing unknowns) of the normal equations
$$\displaystyle \begin{aligned} \left[ \begin{array}{cc} {{\mathbf{N}}_{1,x} } & {{\mathbf{N}}_{1,xy} } \\ {{\mathbf{N}}_{1,xy}^T } & {{\mathbf{N}}_{1,y} } \end{array} \right]\left[ \begin{array}{c} {\hat{\mathbf{x}}_1 } \\ {\hat{\mathbf{y}}_1 } \end{array} \right] & = \left[ \begin{array}{c} {{\mathbf{N}}_{1,x} \hat{\mathbf{x}}_1 + {\mathbf{N}}_{1,xy} \hat{\mathbf{y}}_1 } \\ {{\mathbf{N}}_{1,xy}^T \hat{\mathbf{x}}_1 + {\mathbf{N}}_{1,y} \hat{\mathbf{y}}_1 } \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{u}}_{1x} } \\ {{\mathbf{u}}_{1y} } \end{array} \right], \end{aligned} $$
(260)
$$\displaystyle \begin{aligned} \left[ \begin{array}{cc} {{\mathbf{N}}_{2,x} } & {{\mathbf{N}}_{2,xz} } \\ {{\mathbf{N}}_{2,xz}^T } & {{\mathbf{N}}_{2,z} } \end{array} \right]\left[ \begin{array}{c} {\hat{\mathbf{x}}_2 } \\ {\hat{\mathbf{z}}_2 } \end{array} \right] & = \left[ \begin{array}{c} {{\mathbf{N}}_{2,x} \hat{\mathbf{x}}_2 + {\mathbf{N}}_{2,xz} \hat{\mathbf{z}}_2 } \\ {{\mathbf{N}}_{2,xz}^T \hat{\mathbf{x}}_2 + {\mathbf{N}}_{2,z} \hat{\mathbf{z}}_2 } \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{u}}_{2x} } \\ {{\mathbf{u}}_{2z} } \end{array} \right], \end{aligned} $$
(261)
obtained when adjusting the two data separately. The separate solutions \(\hat {\mathbf {x}}_1 \), \(\hat {\mathbf {y}}_1 \) and \(\hat {\mathbf {x}}_2 \), \(\hat {\mathbf {z}}_2 \) can be obtained with the inclusion of minimal constraints, which are in general different for the two data sets.
The joint solution is given by applying Eqs. (78) and (79) or (85) and (87) and requires the introduction of minimal constraints. Since we are primarily interested in the common parameters x, with the non-common parameters y, z being of secondary interest or even nuisance parameters, we will consider only the case where the minimal constraints \({\mathbf {C}}_x^T \mathbf {x} = \mathbf {d}\) involve only the common parameters. In this case the solution for the parameter estimates and their covariance factor matrices can be expressed in terms of the following sequential algorithm:
$$\displaystyle \begin{aligned} &{} \bar{\mathbf{N}}_{1,x} = {\mathbf{N}}_{1,x} - {\mathbf{N}}_{1,xy} {\mathbf{N}}_{1,y}^{ - 1} {\mathbf{N}}_{1,xy}^T ,\qquad \bar{\mathbf{N}}_{2,x} = {\mathbf{N}}_{2,x} - {\mathbf{N}}_{2,xz} {\mathbf{N}}_{2,z}^{ - 1} {\mathbf{N}}_{2,xz}^T , \end{aligned} $$
(262)
$$\displaystyle \begin{aligned} &{} \bar{\mathbf{u}}_{1x} = {\mathbf{u}}_{1x} - {\mathbf{N}}_{1,xy} {\mathbf{N}}_{1,y}^{ - 1} {\mathbf{u}}_{1y} ,\quad \ \ \ \qquad \bar{\mathbf{u}}_{2x} = {\mathbf{u}}_{2x} - {\mathbf{N}}_{2,xz} {\mathbf{N}}_{2,z}^{ - 1} {\mathbf{u}}_{2z} , \end{aligned} $$
(263)
$$\displaystyle \begin{aligned} &{} \bar{\mathbf{N}}_x = \bar{\mathbf{N}}_{1,x} + \bar{\mathbf{N}}_{2,x} ,\ \quad \qquad \qquad \qquad \bar{\mathbf{u}}_x = \bar{\mathbf{u}}_{1x} + \bar{\mathbf{u}}_{2x} , \end{aligned} $$
(264)
$$\displaystyle \begin{aligned} &{} \hat{\mathbf{x}} = (\bar{\mathbf{N}}_x + {\mathbf{C}}_x {\mathbf{C}}_x^T )^{ - 1}(\bar{\mathbf{u}}_x + {\mathbf{C}}_x \mathbf{d}) = (\bar{\mathbf{N}}_x + {\mathbf{C}}_x {\mathbf{C}}_x^T )^{ - 1}\bar{\mathbf{u}}_x + {\mathbf{E}}_x ({\mathbf{C}}_x^T {\mathbf{E}}_x )^{ - 1}\mathbf{d}, \end{aligned} $$
(265)
$$\displaystyle \begin{aligned} &{} {\mathbf{Q}}_{\hat{\mathbf{x}}} = (\bar{\mathbf{N}}_{1,x} + \bar{\mathbf{N}}_{2,x} + {\mathbf{C}}_x {\mathbf{C}}_x^T )^{ - 1} - {\mathbf{E}}_x ({\mathbf{E}}_x^T {\mathbf{C}}_x {\mathbf{C}}_x^T {\mathbf{E}}_x )^{ - 1}{\mathbf{E}}_x^T , \end{aligned} $$
(266)
$$\displaystyle \begin{aligned} &{} \hat{\mathbf{y}} = {\mathbf{N}}_{1,y}^{-1} ({\mathbf{u}}_{1y} - {\mathbf{N}}_{1,xy}^T \hat{\mathbf{x}}),\qquad \qquad \quad \hat{\mathbf{z}} = {\mathbf{N}}_{2,z}^{ - 1} ({\mathbf{u}}_{2z} - {\mathbf{N}}_{2,xz}^T \hat{\mathbf{x}}), \end{aligned} $$
(267)
$$\displaystyle \begin{aligned} &{} {\mathbf{Q}}_{\hat{\mathbf{x}}\hat{\mathbf{y}}} = - {\mathbf{Q}}_{\hat{\mathbf{x}}} {\mathbf{N}}_{1,xy} {\mathbf{N}}_{1,y}^{ - 1} ,\ \ \qquad \qquad {\mathbf{Q}}_{\hat{\mathbf{x}}\hat{\mathbf{z}}} = - {\mathbf{Q}}_{\hat{\mathbf{x}}} {\mathbf{N}}_{2,xz} {\mathbf{N}}_{2,z}^{ - 1} , \end{aligned} $$
(268)
$$\displaystyle \begin{aligned} &{} {\mathbf{Q}}_{\hat{\mathbf{y}}} = {\mathbf{N}}_{1,y}^{ - 1} - {\mathbf{N}}_{1,y}^{ - 1} {\mathbf{N}}_{1,xy}^T {\mathbf{Q}}_{\hat{\mathbf{x}}\hat{\mathbf{y}}} ,\ \ \qquad \quad {\mathbf{Q}}_{\hat{\mathbf{z}}} = {\mathbf{N}}_{2,z}^{ - 1} - {\mathbf{N}}_{2,z}^{ - 1} {\mathbf{N}}_{2,xz}^T {\mathbf{Q}}_{\hat{\mathbf{x}}\hat{\mathbf{z}}} , \end{aligned} $$
(269)
$$\displaystyle \begin{aligned} &{} {\mathbf{Q}}_{\hat{\mathbf{y}}\hat{\mathbf{z}}} = - {\mathbf{N}}_{1,y}^{ - 1} {\mathbf{N}}_{1,xy}^T {\mathbf{Q}}_{\hat{\mathbf{x}}\hat{\mathbf{z}}} . \end{aligned} $$
(270)
We have assumed that the rank deficiencies in both sets are not related with the non-common parameters but only with the common ones. Thus the rank deficiencies are associated with the design submatrices A1x, A2x and the columns of both A1y, A2z are linearly independent, in which case the submatrices \({\mathbf {N}}_{1,y} = {\mathbf {A}}_{1y}^T {\mathbf {P}}_1 {\mathbf {A}}_{1y} \) and \({\mathbf {N}}_{2,z} = {\mathbf {A}}_{2z}^T {\mathbf {P}}_2 {\mathbf {A}}_{2z} \) are invertible. This restriction has no consequence in the geodetic applications where the common parameters x are associated with station coordinates and the rank deficiencies are due to the lack of reference system definition.
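The sequential algorithm is illustrated by the following self-contained sketch (Python), which uses a hypothetical one-dimensional levelling-type example: two uncorrelated data sets observe height differences of four common stations (so that the common parameters x carry a translational rank defect), each set with its own nuisance bias parameter:

```python
# Self-contained sketch of the sequential algorithm (262)-(267) for a
# hypothetical 1-D levelling-type problem with a translational rank defect
# in the common parameters x and one nuisance bias per data set (y and z).
import numpy as np

rng = np.random.default_rng(4)
A1x = np.array([[-1., 1, 0, 0], [0, -1, 1, 0], [0, 0, -1, 1], [-1, 0, 1, 0]])
A2x = np.array([[-1., 0, 0, 1], [0, -1, 0, 1], [1, 0, -1, 0], [0, 0, 1, -1]])
A1y = A2z = np.ones((4, 1))                   # one bias column per data set
b1, b2 = rng.standard_normal(4), rng.standard_normal(4)

def normals(Ax, Ao, b):
    """Blocks of the normal equations (260)/(261) for one data set (P = I)."""
    return Ax.T @ Ax, Ax.T @ Ao, Ao.T @ Ao, Ax.T @ b, Ao.T @ b

def reduce(Nx, Nxo, No, ux, uo):
    """Eliminate the non-common parameters, eqs. (262)-(263)."""
    S = Nxo @ np.linalg.inv(No)
    return Nx - S @ Nxo.T, ux - S @ uo

N1x, N1xy, N1y, u1x, u1y = normals(A1x, A1y, b1)
N2x, N2xz, N2z, u2x, u2z = normals(A2x, A2z, b2)
N1b, u1b = reduce(N1x, N1xy, N1y, u1x, u1y)
N2b, u2b = reduce(N2x, N2xz, N2z, u2x, u2z)

Cx = np.array([[1., 0, 0, 0]]).T              # minimal constraint: station 1
d = np.zeros(1)
x_hat = np.linalg.solve(N1b + N2b + Cx @ Cx.T, u1b + u2b + Cx @ d)   # (265)
y_hat = np.linalg.solve(N1y, u1y - N1xy.T @ x_hat)                   # (267)
z_hat = np.linalg.solve(N2z, u2z - N2xz.T @ x_hat)                   # (267)
assert np.allclose(Cx.T @ x_hat, d)           # the constraint is satisfied
```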
The solutions of each of the separate normal equations, based only on one of the two data sets, with generally different minimal constraints \({\mathbf {C}}_{1x}^T \mathbf {x} = {\mathbf {d}}_1 \) and \({\mathbf {C}}_{2x}^T \mathbf {x} = {\mathbf {d}}_2 \), have the sequential form
$$\displaystyle \begin{aligned} &{} \hat{\mathbf{x}}_1 = (\bar{\mathbf{N}}_{1,x} + {\mathbf{C}}_{1x} {\mathbf{C}}_{1x}^T )^{ - 1}(\bar{\mathbf{u}}_{1x} + {\mathbf{C}}_{1x} {\mathbf{d}}_1 ) = (\bar{\mathbf{N}}_{1,x} + {\mathbf{C}}_{1x} {\mathbf{C}}_{1x}^T )^{ - 1}\bar{\mathbf{u}}_{1x}+\\ &\qquad + {\mathbf{E}}_x ({\mathbf{C}}_{1x}^T {\mathbf{E}}_x )^{ - 1}{\mathbf{d}}_1 , \end{aligned} $$
(271)
$$\displaystyle \begin{aligned} &{} {\mathbf{Q}}_{1,x} = (\bar{\mathbf{N}}_{1,x} + {\mathbf{C}}_{1x} {\mathbf{C}}_{1x}^T )^{ - 1} - {\mathbf{E}}_x ({\mathbf{E}}_x^T {\mathbf{C}}_{1x} {\mathbf{C}}_{1x}^T {\mathbf{E}}_x )^{ - 1}{\mathbf{E}}_x^T , \end{aligned} $$
(272)
$$\displaystyle \begin{aligned} &{} \hat{\mathbf{y}}_1 = {\mathbf{N}}_{1,y}^{ - 1} ({\mathbf{u}}_{1y} - {\mathbf{N}}_{1,xy}^T \hat{\mathbf{x}}_1 ), \end{aligned} $$
(273)
$$\displaystyle \begin{aligned} &{} {\mathbf{Q}}_{1,xy} = - {\mathbf{Q}}_{1,x} {\mathbf{N}}_{1,xy} {\mathbf{N}}_{1,y}^{ - 1} ,\quad {\mathbf{Q}}_{1,y} = {\mathbf{N}}_{1,y}^{ - 1} - {\mathbf{N}}_{1,y}^{ - 1} {\mathbf{N}}_{1,xy}^T {\mathbf{Q}}_{1,xy} , \end{aligned} $$
(274)
$$\displaystyle \begin{aligned} &{} \hat{\mathbf{x}}_2 = (\bar{\mathbf{N}}_{2,x} + {\mathbf{C}}_{2x} {\mathbf{C}}_{2x}^T )^{ - 1}(\bar{\mathbf{u}}_{2x} + {\mathbf{C}}_{2x} {\mathbf{d}}_2 )=\\ &\quad = (\bar{\mathbf{N}}_{2,x} + {\mathbf{C}}_{2x} {\mathbf{C}}_{2x}^T )^{ - 1}\bar{\mathbf{u}}_{2x} + {\mathbf{E}}_{x} ({\mathbf{C}}_{2x}^T {\mathbf{E}}_x )^{ - 1}{\mathbf{d}}_2, \end{aligned} $$
(275)
$$\displaystyle \begin{aligned} &{\mathbf{Q}}_{2,x} = (\bar{\mathbf{N}}_{2,x} + {\mathbf{C}}_{2x} {\mathbf{C}}_{2x}^T )^{ - 1}-{\mathbf{E}}_x ({\mathbf{E}}_x^T {\mathbf{C}}_{2x} {\mathbf{C}}_{2x}^T {\mathbf{E}}_x )^{ - 1}{\mathbf{E}}_x^T , \end{aligned} $$
(276)
$$\displaystyle \begin{aligned} &{} \hat{\mathbf{z}}_2 = {\mathbf{N}}_{2,z}^{ - 1} ({\mathbf{u}}_{2z} - {\mathbf{N}}_{2,xz}^T \hat{\mathbf{x}}_2 ), \end{aligned} $$
(277)
$$\displaystyle \begin{aligned} &{} {\mathbf{Q}}_{2,xz} = - {\mathbf{Q}}_{2,x} {\mathbf{N}}_{2,xz} {\mathbf{N}}_{2,z}^{ - 1} ,\quad {\mathbf{Q}}_{2,z} = {\mathbf{N}}_{2,z}^{ - 1} - {\mathbf{N}}_{2,z}^{ - 1} {\mathbf{N}}_{2,xz}^T {\mathbf{Q}}_{2,xz} . \end{aligned} $$
(278)
If we eliminate the non-common parameters from the joint normal equations (258) we arrive at the reduced joint normal equations
$$\displaystyle \begin{aligned} \bar{\mathbf{N}}_{x}\hat{\mathbf{x}} = (\bar{\mathbf{N}}_{1,x} + \bar{\mathbf{N}}_{2,x} )\hat{\mathbf{x}} = \bar{\mathbf{u}}_{1x} + \bar{\mathbf{u}}_{2x} = \bar{\mathbf{u}}_{x}. \end{aligned} $$
(279)
The joint solution (265) for \(\hat {\mathbf {x}}\), satisfying the minimal constraints \({\mathbf {C}}_x^T \mathbf {x} = \mathbf {d}\), the separate solution \(\hat {\mathbf {x}}_1 \) from the first data set (271), satisfying the minimal constraints \({\mathbf {C}}_{1x}^T \mathbf {x} = {\mathbf {d}}_1 \), and the separate solution \(\hat {\mathbf {x}}_2 \) from the second data set (275), satisfying the minimal constraints \({\mathbf {C}}_{2x}^T \mathbf {x} = {\mathbf {d}}_2 \), all have the same form, as if the reduced normal equations of each case were combined with the corresponding minimal constraints, exactly as holds for the original unreduced normal equations, provided that the incorporated minimal constraints involve only the common parameters that remain after the reduction.
If we eliminate the non-common parameters from each of the separate normal equations (260) and (261), we arrive at the reduced separate normal equations
$$\displaystyle \begin{aligned} \bar{\mathbf{N}}_{1,x} \hat{\mathbf{x}}_1 = \bar{\mathbf{u}}_{1x} ,\qquad \bar{\mathbf{N}}_{2,x} \hat{\mathbf{x}}_{2} = \bar{\mathbf{u}}_{2x} . \end{aligned} $$
(280)
Addition of the above reduced normal equations reproduces Eq. (279) derived by eliminating the non-common parameters from the joint normal equations. Thus we come to the following conclusion:

Proposition 3

Elimination of the non-common parameters from the joint normal equations of two uncorrelated data sets is equivalent to the formulation of the separate normal equations for each set, separate elimination of the non-common parameters and addition of the reduced separate normal equations.

Thus addition of the normal equations from the two sets followed by elimination of the non-common parameters is equivalent to elimination of the non-common parameters from each set followed by addition of the reduced normal equations.

We turn next to the usual procedure of using separately obtained estimates from uncorrelated data sets as pseudo-observations for the estimation of parameters in a combination step. Ideally, we would like to recover the same solution as if the two or more uncorrelated data sets with common parameters were adjusted simultaneously. The critical question in this case is which weight matrices to use in the least squares solution. If there is no rank defect, the answer is trivial: we just use the inverses of the covariance factor matrices. In rank-deficient models, the covariance matrices are singular and cannot be inverted. One can resort to Rao’s unified estimation theory (see e.g., [40, 54, 55]), which takes into account singular covariance matrices for the observations. Fortunately, it will turn out that we can avoid this complication. First, we will utilize the left-hand sides of the separate normal equations (260) and (261) in the joint normal equations (258), in order to express their right-hand sides in terms of the separate estimates
$$\displaystyle \begin{aligned} \left[ \begin{array}{ccc} {{\mathbf{N}}_{1,x} + {\mathbf{N}}_{2,x} } & {{\mathbf{N}}_{1,xy} } & {{\mathbf{N}}_{2,xz} } \\ {{\mathbf{N}}_{1,xy}^T } & {{\mathbf{N}}_{1,y} } & \mathbf{0} \\ {{\mathbf{N}}_{2,xz}^T } & \mathbf{0} & {{\mathbf{N}}_{2,z} } \end{array} \right]\left[ \begin{array}{c} \hat{\mathbf{x}} \\ \hat{\mathbf{y}} \\ \hat{\mathbf{z}} \end{array} \right] = \left[ \begin{array}{c} {({\mathbf{N}}_{1,x} \hat{\mathbf{x}}_1 + {\mathbf{N}}_{1,xy} \hat{\mathbf{y}}_1 ) + ({\mathbf{N}}_{2,x} \hat{\mathbf{x}}_2 + {\mathbf{N}}_{2,xz} \hat{\mathbf{z}}_2 )} \\ {{\mathbf{N}}_{1,xy}^T \hat{\mathbf{x}}_1 + {\mathbf{N}}_{1,y} \hat{\mathbf{y}}_1 } \\ {{\mathbf{N}}_{2,xz}^T \hat{\mathbf{x}}_2 + {\mathbf{N}}_{2,z} \hat{\mathbf{z}}_2 } \end{array} \right]. \end{aligned} $$
(281)
If the estimates of the separate solutions are used as pseudo-observations of the respective unknowns
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} {\hat{\mathbf{x}}_1 } \\ {\hat{\mathbf{y}}_1 } \end{array} \right] = \left[ \begin{array}{c} \mathbf{x} \\ \mathbf{y} \end{array} \right] + \left[ \begin{array}{c} {{\mathbf{e}}_{\hat{\mathbf{x}}_1 } } \\ {{\mathbf{e}}_{\hat{\mathbf{y}}_1 } } \end{array} \right],\qquad \left[ \begin{array}{c} {\hat{\mathbf{x}}_2 } \\ {\hat{\mathbf{z}}_2 } \end{array} \right] = \left[ \begin{array}{c} \mathbf{x} \\ \mathbf{z} \end{array} \right] + \left[ \begin{array}{c} {{\mathbf{e}}_{\hat{\mathbf{x}}_2}} \\ {{\mathbf{e}}_{\hat{\mathbf{z}}_2}} \end{array} \right], \end{aligned} $$
(282)
with yet unspecified weight matrices \(\left [ \begin {array}{cc} {{\mathbf {P}}_{1,x} } & {{\mathbf {P}}_{1,xy} } \\ {{\mathbf {P}}_{1,xy}^T } & {{\mathbf {P}}_{1,y} } \end {array} \right ]\), \(\left [ \begin {array}{cc} {{\mathbf {P}}_{2,x} } & {{\mathbf {P}}_{2,xz} } \\ {{\mathbf {P}}_{2,xz}^T } & {{\mathbf {P}}_{2,z} } \end {array} \right ]\), respectively, we will arrive at the normal equations
$$\displaystyle \begin{aligned} \left[ \begin{array}{ccc} {{\mathbf{P}}_{1,x} + {\mathbf{P}}_{2,x} } & {{\mathbf{P}}_{1,xy} } & {{\mathbf{P}}_{2,xz} } \\ {{\mathbf{P}}_{1,xy}^T } & {{\mathbf{P}}_{1,y} } & \mathbf{0} \\ {{\mathbf{P}}_{2,xz}^T } & \mathbf{0} & {{\mathbf{P}}_{2,z} } \end{array} \right]\left[ \begin{array}{c} \hat{\mathbf{x}} \\ \hat{\mathbf{y}} \\ \hat{\mathbf{z}} \end{array} \right] = \left[ \begin{array}{c} {({\mathbf{P}}_{1,x} \hat{\mathbf{x}}_1 + {\mathbf{P}}_{1,xy} \hat{\mathbf{y}}_1 ) + ({\mathbf{P}}_{2,x} \hat{\mathbf{x}}_2 + {\mathbf{P}}_{2,xz} \hat{\mathbf{z}}_2 )} \\ {{\mathbf{P}}_{1,xy}^T \hat{\mathbf{x}}_1 + {\mathbf{P}}_{1,y} \hat{\mathbf{y}}_1 } \\ {{\mathbf{P}}_{2,xz}^T \hat{\mathbf{x}}_2 + {\mathbf{P}}_{2,z} \hat{\mathbf{z}}_2 } \end{array} \right]. \end{aligned} $$
(283)
Comparison with the directly obtained joint solution in the form (281) shows that the latter can be obtained using the estimates from the separate solutions, provided that the coefficient matrices of the separate normal equations are used as weight matrices
$$\displaystyle \begin{aligned} \left[ \begin{array}{cc} {{\mathbf{P}}_{1,x} } & {{\mathbf{P}}_{1,xy} } \\ {{\mathbf{P}}_{1,xy}^T } & {{\mathbf{P}}_{1,y} } \end{array} \right] = \left[ \begin{array}{cc} {{\mathbf{N}}_{1,x} } & {{\mathbf{N}}_{1,xy} } \\ {{\mathbf{N}}_{1,xy}^T } & {{\mathbf{N}}_{1,y} } \end{array} \right],\qquad \left[ \begin{array}{cc} {{\mathbf{P}}_{2,x} } & {{\mathbf{P}}_{2,xz} } \\ {{\mathbf{P}}_{2,xz}^T } & {{\mathbf{P}}_{2,z} } \end{array} \right] = \left[ \begin{array}{cc} {{\mathbf{N}}_{2,x} } & {{\mathbf{N}}_{2,xz} } \\ {{\mathbf{N}}_{2,xz}^T } & {{\mathbf{N}}_{2,z} } \end{array} \right]. \end{aligned} $$
(284)
Thus we have arrived at the following conclusion:

Proposition 4

If the parameter estimates obtained from the separate adjustments of uncorrelated data sets are jointly used as pseudo-observations of the corresponding unknown parameters, with weight matrices identical to the coefficient matrices of the separate adjustment normal equations, then the obtained normal equations are the same as when all data sets are jointly adjusted.

There is a point that needs to be stressed with respect to the above proposition. It has been implicitly assumed that the same unknown variance factor σ2 applies to all data sets \({\mathbf {b}}_i = {\mathbf {A}}_{i,\mathbf {x}} \mathbf {x} + {\mathbf {A}}_{i,{\mathbf {y}}_i } {\mathbf {y}}_i + {\mathbf {e}}_i \), \({\mathbf {e}}_i \sim (\mathbf {0}, \sigma ^2{\mathbf {P}}_i^{ - 1} )\), i = 1, …, q. Thus we cannot cover a model with variance component estimation, where each data set \({\mathbf {b}}_i = {\mathbf {A}}_{i,\mathbf {x}} \mathbf {x} + {\mathbf {A}}_{i,{\mathbf {y}}_i } {\mathbf {y}}_i + {\mathbf {e}}_i \), i = 1, …, q, has a different unknown variance factor \(\sigma _i^2 \), i.e., \({\mathbf {e}}_i \sim (\mathbf {0}, \sigma _i^2 {\mathbf {P}}_i^{ - 1} )\). In geodesy we know the covariance matrix Ci of each data set and we may directly use the weight matrices \({\mathbf {P}}_i = {\mathbf {C}}_i^{ - 1} \), as e.g., suggested by W. Baarda and the Delft school. Nevertheless, separate estimates \(\hat {\sigma }_i^2 = \hat {\mathbf {e}}_i^T {\mathbf {P}}_i \hat {\mathbf {e}}_i / f_i \) are used in most cases in order to check the statistical hypotheses \(\sigma _i^2 = 1\), as a means of checking the validity of the model rather than the existence of a missed non-unity factor in the data covariances. A large value of \(\hat {\sigma }_i^2 \) reflects either insufficient modeling (ignored effects in the deterministic part of the model), or the existence of ignored correlations in the observations, or the presence of systematic errors, i.e., violation of the zero mean hypothesis E{ei} = 0. It is customary, though, to scale the obtained covariance factor matrix \({\mathbf {Q}}_{\hat {\mathbf {x}}_{(i)}} \), which theoretically is the covariance matrix of the separate parameter estimates \(\hat {\mathbf {x}}_{(i)}\) under the choice \({\mathbf {P}}_i = {\mathbf {C}}_i^{ - 1} \), by the additional inflation factor \(\hat {\sigma }_i^2 \), to obtain a covariance matrix estimate \(\hat {\mathbf {C}}_{\hat {\mathbf {x}}_{(i)}} = \hat {\sigma }_i^2 {\mathbf {Q}}_{\hat {\mathbf {x}}_{(i)}} \), in an attempt to partially account for the use of an incorrect model, without any real theoretical foundation for this choice. As we have already mentioned, and will repeat in what follows, statistical optimality, in the sense of following the correct theoretical procedures, is not a panacea when the basic Gauss-Markov model (known deterministic model, zero mean errors, error covariance matrix known except for a multiplicative factor) is not consistent with the data. It may happen that sub-optimal procedures prove to be more robust with respect to deterministic and stochastic model errors and thus produce better (closer to reality) estimates. In the special case where there are no defects associated with the definition of the reference system, the resulting normal equations and covariance factor matrices are non-singular. Then using the estimates \(\hat {\mathbf {C}}_{\hat {\mathbf {x}}_{(i)} } = \hat {\sigma }_i^2 {\mathbf {Q}}_{\hat {\mathbf {x}}_{(i)} } \), with corresponding weight matrices \({\mathbf {P}}_{\hat {\mathbf {x}}_{(i)}} = \hat {\mathbf {C}}_{\hat {\mathbf {x}}_{(i)}}^{ - 1} = \hat {\sigma }_i^{ - 2} {\mathbf {Q}}_{\hat {\mathbf {x}}_{(i)}}^{-1} \), in the combination step amounts to an ad hoc method of variance component estimation.
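The ad hoc weighting just described can be condensed into a small helper, shown below as a sketch with hypothetical argument names; it presumes the non-singular case mentioned above, where the cofactor matrix of a separate solution can be inverted.

```python
import numpy as np

def combination_weight(Q_i, e_hat_i, P_i, f_i):
    """Scale the cofactor matrix of a separate solution by its a-posteriori
    variance factor and invert it, giving the weight for the combination step."""
    sigma2_i = float(e_hat_i @ P_i @ e_hat_i) / f_i   # a-posteriori variance factor
    C_hat_i = sigma2_i * Q_i                          # inflated covariance estimate
    return np.linalg.inv(C_hat_i)                     # weight P = C_hat^{-1}
```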

Let us now turn to the case where the non-common parameters are nuisance parameters and we are only interested in the estimates of the common parameters. In this case we can use the separate estimates \(\hat {\mathbf {x}}_1 \), \(\hat {\mathbf {x}}_2 \) as pseudo-observations
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} {\hat{\mathbf{x}}_1 } \\ {\hat{\mathbf{x}}_2 } \end{array} \right] = \left[ \begin{array}{c} \mathbf{x} \\ \mathbf{x} \end{array} \right] + \left[ \begin{array}{c} {{\mathbf{e}}_{\hat{\mathbf{x}}_1 } } \\ {{\mathbf{e}}_{\hat{\mathbf{x}}_2 } } \end{array} \right] = \left[ \begin{array}{c} \mathbf{I} \\ \mathbf{I} \end{array} \right]\mathbf{x} + \left[ \begin{array}{c} {{\mathbf{e}}_{\hat{\mathbf{x}}_1 } } \\ {{\mathbf{e}}_{\hat{\mathbf{x}}_2 } } \end{array} \right], \end{aligned} $$
(285)
with general weight matrices \({\mathbf {P}}_{\hat {\mathbf {x}}_1 } \) and \({\mathbf {P}}_{\hat {\mathbf {x}}_2 } \). The least squares solution leads to the combined normal equations
$$\displaystyle \begin{aligned} ({\mathbf{P}}_{\hat{\mathbf{x}}_1 } + {\mathbf{P}}_{\hat{\mathbf{x}}_2 } )\hat{\mathbf{x}} = {\mathbf{P}}_{\hat{\mathbf{x}}_1 } \hat{\mathbf{x}}_1 + {\mathbf{P}}_{\hat{\mathbf{x}}_2 } \hat{\mathbf{x}}_2 . \end{aligned} $$
(286)
Comparison with the reduced normal equations (279) combined with (280) shows that we can recover the optimal joint estimates if we use the reduced normal equation matrices of the separate solutions as weight matrices, \({\mathbf {P}}_{\hat {\mathbf {x}}_1 } = \bar {\mathbf {N}}_{1,x}\), \({\mathbf {P}}_{\hat {\mathbf {x}}_2 } = \bar {\mathbf {N}}_{2,x}\). Thus we have arrived at the conclusion:

Proposition 5

The optimal estimates of the common parameters obtained from the joint adjustment of uncorrelated data sets can also be obtained in the following way: Form the separate normal equations of each data set and obtain the reduced normal equation coefficient matrices by eliminating the non-common parameters; determine the separate solutions with the help of appropriate minimal constraints; use the separate estimates as pseudo-observations of the common unknowns, with the reduced coefficient matrices of the normal equations as weight matrices, and determine the resulting solution.

In plain words, the choice of reference system present in the separate coordinate estimates is “killed” by multiplication with their weight matrices coming from the separate normal equations. Thus the joint normal equations carry no reference system information, and this must be introduced through a choice of minimal constraints.
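A toy verification of Proposition 5 in the full-rank case (so no minimal constraints are needed and the separate estimates are unique): combining the separate estimates with the reduced normal matrices as weights reproduces the joint estimate of the common parameters.

```python
import numpy as np

rng = np.random.default_rng(2)

def reduce_to_x(N, u, nx):
    Nxx, Nxy, Nyy = N[:nx, :nx], N[:nx, nx:], N[nx:, nx:]
    S = np.linalg.solve(Nyy, Nxy.T)
    return Nxx - Nxy @ S, u[:nx] - S.T @ u[nx:]

nx = 3
A1 = np.hstack([rng.normal(size=(10, nx)), rng.normal(size=(10, 2))])
A2 = np.hstack([rng.normal(size=(12, nx)), rng.normal(size=(12, 2))])
b1, b2 = rng.normal(size=10), rng.normal(size=12)

N1, u1 = A1.T @ A1, A1.T @ b1
N2, u2 = A2.T @ A2, A2.T @ b2
x1 = np.linalg.solve(N1, u1)[:nx]          # separate estimate of x from set 1
x2 = np.linalg.solve(N2, u2)[:nx]          # separate estimate of x from set 2
N1r, _ = reduce_to_x(N1, u1, nx)           # reduced coefficient matrices ...
N2r, _ = reduce_to_x(N2, u2, nx)           # ... used below as weights

# pseudo-observation combination with the reduced normals as weights
x_comb = np.linalg.solve(N1r + N2r, N1r @ x1 + N2r @ x2)

# joint adjustment with parameter order (x, y1, y2)
Aj = np.block([[A1[:, :nx], A1[:, nx:], np.zeros((10, 2))],
               [A2[:, :nx], np.zeros((12, 2)), A2[:, nx:]]])
bj = np.concatenate([b1, b2])
x_joint = np.linalg.lstsq(Aj, bj, rcond=None)[0][:nx]
assert np.allclose(x_comb, x_joint)
```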

Note that if one uses instead the unreduced submatrices N1,x, N2,x of the separate normal equations as weight matrices, one arrives at normal equations \(({\mathbf {N}}_{1,x} + {\mathbf {N}}_{2,x} )\hat {\mathbf {x}}_R = {\mathbf {N}}_{1,x} \hat {\mathbf {x}}_1 + {\mathbf {N}}_{2,x} \hat {\mathbf {x}}_2\), which yield a suboptimal set of solutions \(\hat {\mathbf {x}}_R \), different from the set of optimal ones \(\hat {\mathbf {x}}\).

This means, e.g., that in theory one should not rely on the normal equation sub-matrices of the per-epoch coordinates and EOPs from SLR observations, but on the complete normal equation matrix, involving also the spherical harmonic gravity field parameters, which must be reduced in order to obtain the proper weight matrix in the ITRF formulation. Furthermore, the gravity field parameter estimates should be updated on the basis of the per-epoch coordinate and EOP estimates provided by the final ITRF solution. The question is whether this theoretically “correct” approach should be followed in place of the actually used restricted approach. The answer is yes, provided that the deterministic and stochastic model used in SLR data analysis is consistent with reality. Again, we face the question of the relevance of formal statistical optimality when there are doubts about the validity of the model upon which the optimal procedure has been built. The sub-optimal restricted solution may prove more robust, by not allowing incorrect information from the gravity field estimation part to flow into the procedure of ITRF parameter estimation.

13.2 Non-adjustable Observations in Uncorrelated Data Sets

We will turn next to particular sets of observational data, which we will call non-adjustable observations. Let us suppose that in the second set of data b2 = A2xx + A2zz + e2 the submatrix A2z is square and nonsingular. It is then obvious that, for any value of the estimate \(\hat {\mathbf {x}}\), there is a corresponding value \(\hat {\mathbf {z}} = {\mathbf {A}}_{2z}^{ - 1} ({\mathbf {b}}_2 - {\mathbf {A}}_{2x} \hat {\mathbf {x}})\) which gives the error estimates
$$\displaystyle \begin{aligned} \hat{\mathbf{e}}_2 = {\mathbf{b}}_2 - {\mathbf{A}}_{2x} \hat{\mathbf{x}} - {\mathbf{A}}_{2z} \hat{\mathbf{z}} = {\mathbf{b}}_2 - {\mathbf{A}}_{2x} \hat{\mathbf{x}} - {\mathbf{A}}_{2z} {\mathbf{A}}_{2z}^{ - 1} ({\mathbf{b}}_2 - {\mathbf{A}}_{2x} \hat{\mathbf{x}}) = \mathbf{0}, \end{aligned} $$
(287)
and thus contributes nothing to the least squares sum being minimized. In such a situation we say that the observations b2 are non-adjustable with respect to the parameters z. To see exactly what happens in this case, we form the joint normal equations for the two sets, which explicitly are
$$\displaystyle \begin{gathered} {} ({\mathbf{N}}_{1,x} + {\mathbf{N}}_{2,x} )\hat{\mathbf{x}} + {\mathbf{N}}_{1,xy} \hat{\mathbf{y}} + {\mathbf{A}}_{2x}^T {\mathbf{P}}_2 {\mathbf{A}}_{2z} \hat{\mathbf{z}} = {\mathbf{u}}_{1x} + {\mathbf{u}}_{2x} , \end{gathered} $$
(288)
$$\displaystyle \begin{gathered} {} {\mathbf{N}}_{1,xy}^T \hat{\mathbf{x}} + {\mathbf{N}}_{1,y} \hat{\mathbf{y}} = {\mathbf{u}}_{1y} , \end{gathered} $$
(289)
$$\displaystyle \begin{gathered} {} {\mathbf{A}}_{2z}^T {\mathbf{P}}_2 {\mathbf{A}}_{2x} \hat{\mathbf{x}} + {\mathbf{A}}_{2z}^T {\mathbf{P}}_2 {\mathbf{A}}_{2z} \hat{\mathbf{z}} = {\mathbf{A}}_{2z}^T {\mathbf{P}}_2 {\mathbf{b}}_2 . \end{gathered} $$
(290)
Since both \({\mathbf {A}}_{2z}^T \) and P2 are invertible, (290) simplifies to \({\mathbf {A}}_{2x} \hat {\mathbf {x}} + {\mathbf {A}}_{2z} \hat {\mathbf {z}} = {\mathbf {b}}_2 \), which solved for \(\hat {\mathbf {z}}\) gives
$$\displaystyle \begin{aligned}\hat{\mathbf{z}} = {\mathbf{A}}_{2z}^{-1}({\mathbf{b}}_2 - {\mathbf{A}}_{2x}\hat{\mathbf{x}}) \end{aligned} $$
(291)
Replacing this value in the remaining normal equations (288) and (289), these become
$$\displaystyle \begin{aligned} & {\mathbf{N}}_{1,x}\hat{\mathbf{x}} + {\mathbf{N}}_{1,xy}\hat{\mathbf{y}} = {\mathbf{u}}_{1x} \end{aligned} $$
(292)
$$\displaystyle \begin{aligned} & {\mathbf{N}}_{1,xy}^{T}\hat{\mathbf{x}} + {\mathbf{N}}_{1,y}\hat{\mathbf{y}} = {\mathbf{u}}_{1y} \end{aligned} $$
(293)
which are exactly the ones from the separate adjustment of only the first set of data b1. Solving (292) and (293) we obtain directly the jointly optimal estimates \(\hat {\mathbf {x}}\) and \(\hat {\mathbf {y}}\), which can be used in (291) in order to compute the jointly optimal estimate \(\hat {\mathbf {z}}\). Thus we come to the following conclusion:

Proposition 6

When one of two uncorrelated data sets is non-adjustable with respect to its non-common parameters, the adjustment of the other set provides directly the jointly optimal estimates of its parameters. The obtained estimates can be used to derive the jointly optimal estimates of the non-common parameters of the non-adjustable data set, utilizing the observation model of the non-adjustable data as if it were free of errors.

The covariance factor matrices \({\mathbf {Q}}_{\hat {\mathbf {x}}}\), \({\mathbf {Q}}_{\hat {\mathbf {y}}}\), \({\mathbf {Q}}_{\hat {\mathbf {x}}\hat {\mathbf {y}}}\) are obtained from the separate adjustment of only the first set of data b1. The remaining covariance factor matrices can be computed sequentially from
$$\displaystyle \begin{aligned} \left[\begin{array}{c} {\mathbf{Q}}_{\hat{\mathbf{x}}\hat{\mathbf{z}}}\\ {\mathbf{Q}}_{\hat{\mathbf{y}}\hat{\mathbf{z}}} \end{array}\right] = \left[\begin{array}{c} -{\mathbf{Q}}_{\hat{\mathbf{x}}}{\mathbf{A}}_{2x}^{T} {\mathbf{A}}_{2z}^{-T}\\ -{\mathbf{Q}}_{\hat{\mathbf{x}}\hat{\mathbf{y}}}^{T}{\mathbf{A}}_{2x}^{T} {\mathbf{A}}_{2z}^{-T} \end{array} \right], \qquad {\mathbf{Q}}_{\hat{\mathbf{z}}} = \left({\mathbf{A}}_{2z}^{T}{\mathbf{P}}_2{\mathbf{A}}_{2z}\right)^{-1}-{\mathbf{A}}_{2z}^{-1}{\mathbf{A}}_{2x}{\mathbf{Q}}_{\hat{\mathbf{x}}\hat{\mathbf{z}}}. \end{aligned} $$
(294)
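A numerical sketch of Proposition 6 with synthetic full-rank matrices and unit weights: the second set is non-adjustable with respect to z (A2z square and invertible), so set 1 alone yields the jointly optimal x and y, and z follows error-free from Eq. (291).

```python
import numpy as np

rng = np.random.default_rng(3)
nx, ny, nz = 3, 2, 4
A1x, A1y = rng.normal(size=(12, nx)), rng.normal(size=(12, ny))
A2x = rng.normal(size=(nz, nx))
A2z = rng.normal(size=(nz, nz)) + nz * np.eye(nz)   # square and invertible
b1, b2 = rng.normal(size=12), rng.normal(size=nz)

# adjust set 1 alone
A1 = np.hstack([A1x, A1y])
ab = np.linalg.lstsq(A1, b1, rcond=None)[0]
x1, y1 = ab[:nx], ab[nx:]
z1 = np.linalg.solve(A2z, b2 - A2x @ x1)            # Eq. (291), zero residuals

# joint adjustment of both sets, parameter order (x, y, z)
A = np.block([[A1x, A1y, np.zeros((12, nz))],
              [A2x, np.zeros((nz, ny)), A2z]])
b = np.concatenate([b1, b2])
sol = np.linalg.lstsq(A, b, rcond=None)[0]
assert np.allclose(sol, np.concatenate([x1, y1, z1]))
```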

13.3 Non-adjustable Observations in Correlated Data Sets

So far we have considered only uncorrelated data sets. We will now generalize the previous case of a non-adjustable data set to the case where the two data sets are in fact correlated. The observation equations in this case are
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} {\mathbf{b}}_1\\ {\mathbf{b}}_2 \end{array} \right] = \left[\begin{array}{l} {\mathbf{A}}_{1x}\mathbf{x} + {\mathbf{A}}_{1y}\mathbf{y} + {\mathbf{e}}_1\\ {\mathbf{A}}_{2x}\mathbf{x} + {\mathbf{A}}_{2z}\mathbf{z} + {\mathbf{e}}_2\\ \end{array}\right] = \left[\begin{array}{ccc} {\mathbf{A}}_{1x} & {\mathbf{A}}_{1y} & \mathbf{0}\\ {\mathbf{A}}_{2x} & \mathbf{0} & {\mathbf{A}}_{2z}\\ \end{array} \right] \left[\begin{array}{c} \mathbf{x}\\ \mathbf{y}\\ \mathbf{z} \end{array} \right] + \left[\begin{array}{c} {\mathbf{e}}_1 \\ {\mathbf{e}}_2 \end{array} \right], \end{aligned} $$
(295)
with weight matrix being the inverse of the joint covariance factor matrix
$$\displaystyle \begin{aligned} \left[\begin{array}{cc} {\mathbf{P}}_{1} & {\mathbf{P}}_{12}\\ {\mathbf{P}}_{12}^{T} & {\mathbf{P}}_{2} \end{array} \right] = \left[\begin{array}{cc} {\mathbf{Q}}_{1} & {\mathbf{Q}}_{12}\\ {\mathbf{Q}}_{12}^{T} & {\mathbf{Q}}_{2} \end{array} \right]^{-1}, \end{aligned} $$
(296)
where the non-zero submatrix P12 is a result of the existing correlation (Q12 ≠ 0). We will need some relations following from the last one, namely
$$\displaystyle \begin{aligned} {\mathbf{P}}_{2}^{-1} = {\mathbf{Q}}_2 - {\mathbf{Q}}_{12}^{T}{\mathbf{Q}}_{1}^{-1}{\mathbf{Q}}_{12}, \qquad {\mathbf{P}}_2^{-1}{\mathbf{P}}_{12}^{T} = - {\mathbf{Q}}_{12}^{T}{\mathbf{Q}}_{1}^{-1}, \qquad {\mathbf{P}}_1 - {\mathbf{P}}_{12}{\mathbf{P}}_{2}^{-1}{\mathbf{P}}_{12}^{T} = {\mathbf{Q}}_{1}^{-1}. \end{aligned} $$
(297)
The joint normal equations take in this case the form
$$\displaystyle \begin{aligned} \left[\begin{array}{ccc} {\mathbf{N}}_{\mathbf{x}} & {\mathbf{N}}_{\mathbf{xy}} & {\mathbf{N}}_{\mathbf{xz}}\\ {\mathbf{N}}_{\mathbf{xy}}^{T} & {\mathbf{N}}_{\mathbf{y}} & {\mathbf{N}}_{\mathbf{yz}}\\ {\mathbf{N}}_{\mathbf{xz}}^{T} & {\mathbf{N}}_{\mathbf{yz}}^{T} & {\mathbf{N}}_{\mathbf{z}} \end{array} \right] \left[\begin{array}{c} \hat{\mathbf{x}}\\ \hat{\mathbf{y}}\\ \hat{\mathbf{z}} \end{array} \right] = \left[\begin{array}{c} {\mathbf{u}}_x\\ {\mathbf{u}}_y\\ {\mathbf{u}}_z \end{array} \right], \end{aligned} $$
(298)
where
$$\displaystyle \begin{aligned} &{\mathbf{N}}_{\mathbf{x}} = {\mathbf{A}}_{1x}^{T}{\mathbf{P}}_1 {\mathbf{A}}_{1x} + {\mathbf{A}}_{2x}^{T}{\mathbf{P}}_{12}^{T} {\mathbf{A}}_{1x} + {\mathbf{A}}_{1x}^{T}{\mathbf{P}}_{12} {\mathbf{A}}_{2x} + {\mathbf{A}}_{2x}^{T}{\mathbf{P}}_2 {\mathbf{A}}_{2x}\\ &{\mathbf{N}}_{\mathbf{xy}} = {\mathbf{A}}_{1x}^{T}{\mathbf{P}}_1 {\mathbf{A}}_{1y} + {\mathbf{A}}_{2x}^{T}{\mathbf{P}}_{12}^{T} {\mathbf{A}}_{1y}, \quad {\mathbf{N}}_{\mathbf{xz}} = {\mathbf{A}}_{1x}^{T} {\mathbf{P}}_{12} {\mathbf{A}}_{2z} + {\mathbf{A}}_{2x}^{T} {\mathbf{P}}_{2}{\mathbf{A}}_{2z},\\ &{\mathbf{N}}_{\mathbf{y}} = {\mathbf{A}}_{1y}^{T} {\mathbf{P}}_{1}{\mathbf{A}}_{1y}, \quad {\mathbf{N}}_{\mathbf{yz}} = {\mathbf{A}}_{1y}^{T} {\mathbf{P}}_{12}{\mathbf{A}}_{2z}, \quad {\mathbf{N}}_{\mathbf{z}} = {\mathbf{A}}_{2z}^{T}{\mathbf{P}}_2{\mathbf{A}}_{2z},{} \end{aligned} $$
(299)
$$\displaystyle \begin{aligned} &{\mathbf{u}}_{x} = {\mathbf{A}}_{1x}^{T}{\mathbf{P}}_1{\mathbf{b}}_1 + {\mathbf{A}}_{2x}^{T}{\mathbf{P}}_{12}^{T}{\mathbf{b}}_1+{\mathbf{A}}_{1x}^{T}{\mathbf{P}}_{12}{\mathbf{b}}_2 + {\mathbf{A}}_{2x}^{T}{\mathbf{P}}_2{\mathbf{b}}_2,\\ &{\mathbf{u}}_y ={\mathbf{A}}_{1y}^{T}{\mathbf{P}}_1{\mathbf{b}}_1 + {\mathbf{A}}_{1y}^{T}{\mathbf{P}}_{12}{\mathbf{b}}_2,\quad {\mathbf{u}}_z = {\mathbf{A}}_{2z}^{T}{\mathbf{P}}_{12}^{T}{\mathbf{b}}_1 + {\mathbf{A}}_{2z}^{T}{\mathbf{P}}_2{\mathbf{b}}_2.{} \end{aligned} $$
(300)
In particular, the third equation of (298) has the explicit form
$$\displaystyle \begin{aligned} ({\mathbf{A}}_{2z}^{T}{\mathbf{P}}_{12}^{T}{\mathbf{A}}_{1x} + {\mathbf{A}}_{2z}^{T}{\mathbf{P}}_2{\mathbf{A}}_{2x})\hat{\mathbf{x}} + {\mathbf{A}}_{2z}^{T}{\mathbf{P}}_{12}^{T}{\mathbf{A}}_{1y}\hat{\mathbf{y}} + {\mathbf{A}}_{2z}^{T}{\mathbf{P}}_2{\mathbf{A}}_{2z}\hat{\mathbf{z}} = {\mathbf{A}}_{2z}^{T}{\mathbf{P}}_{12}^{T}{\mathbf{b}}_1 + {\mathbf{A}}_{2z}^{T}{\mathbf{P}}_2{\mathbf{b}}_2. \end{aligned} $$
(301)
In view of the fact that both \({\mathbf {A}}_{2z}^{T}\) and \({\mathbf {P}}_2\) are invertible, we may solve the last equation for \(\hat {\mathbf {z}}\), taking the relation \({\mathbf {P}}_{2}^{-1}{\mathbf {P}}_{12}^{T} = -{\mathbf {Q}}_{12}^{T}{\mathbf {Q}}_{1}^{-1}\) into account, to obtain
$$\displaystyle \begin{aligned} \hat{\mathbf{z}} = {\mathbf{A}}_{2z}^{-1}\left[{\mathbf{b}}_2 - {\mathbf{A}}_{2x}\hat{\mathbf{x}} - {\mathbf{Q}}_{12}^{T}{\mathbf{Q}}_{1}^{-1}({\mathbf{b}}_1 - {\mathbf{A}}_{1x}\hat{\mathbf{x}} - {\mathbf{A}}_{1y}\hat{\mathbf{y}}) \right]. \end{aligned} $$
(302)
An interesting interpretation of the above relation can be obtained if it is analyzed in three steps
$$\displaystyle \begin{aligned} \hat{\mathbf{e}}_1 = {\mathbf{b}}_1 - {\mathbf{A}}_{1x}\hat{\mathbf{x}} - {\mathbf{A}}_{1y}\hat{\mathbf{y}},\quad \tilde{\mathbf{e}}_2 = {\mathbf{Q}}_{12}^{T}{\mathbf{Q}}_1^{-1}\hat{\mathbf{e}}_1 = {\mathbf{Q}}_{21}{\mathbf{Q}}_{1}^{-1}\hat{\mathbf{e}}_1, \quad \hat{\mathbf{z}} = {\mathbf{A}}_{2z}^{-1}({\mathbf{b}}_2 - {\mathbf{A}}_{2x}\hat{\mathbf{x}} - \tilde{\mathbf{e}}_2). \end{aligned} $$
(303)
In the first step an estimate \(\hat {\mathbf {e}}_1\) of the errors in the first data set is computed. In the second step this estimate is used in order to obtain a prediction \(\tilde {\mathbf {e}}_2\) of the errors in the second data set. Finally, in the third step the estimate \(\hat {\mathbf {x}}\) and the prediction \(\tilde {\mathbf {e}}_2\) are substituted in the observation equations b2 = A2xx + A2zz + e2 and the resulting relation is solved for \(\hat {\mathbf {z}}\), in order to express it in terms of \(\hat {\mathbf {x}}\) and \(\tilde {\mathbf {e}}_2\).
If the above value of \(\hat {\mathbf {z}}\) is replaced in the first two equations of (298), taking into account that \({\mathbf {P}}_1-{\mathbf {P}}_{12}{\mathbf {P}}_{2}^{-1}{\mathbf {P}}_{12}^{T}={\mathbf {Q}}_{1}^{-1}\), we obtain the reduced normal equations
$$\displaystyle \begin{aligned} ({\mathbf{A}}_{1x}^{T}{\mathbf{Q}}_{1}^{-1}{\mathbf{A}}_{1x})\hat{\mathbf{x}} + ({\mathbf{A}}_{1x}^{T}{\mathbf{Q}}_{1}^{-1}{\mathbf{A}}_{1y})\hat{\mathbf{y}} = {\mathbf{A}}_{1x}^{T}{\mathbf{Q}}_{1}^{-1}{\mathbf{b}}_{1}, \end{aligned} $$
(304)
$$\displaystyle \begin{aligned} ({\mathbf{A}}_{1y}^{T}{\mathbf{Q}}_{1}^{-1}{\mathbf{A}}_{1x})\hat{\mathbf{x}} + ({\mathbf{A}}_{1y}^{T}{\mathbf{Q}}_{1}^{-1}{\mathbf{A}}_{1y})\hat{\mathbf{y}} = {\mathbf{A}}_{1y}^{T}{\mathbf{Q}}_{1}^{-1}{\mathbf{b}}_{1}. \end{aligned} $$
(305)
These are none other than the normal equations obtained from the adjustment of the first set b1 alone, completely ignoring the existence of the second set and, in particular, its correlation with the first. Thus we come to the following conclusion:

Proposition 7

When one of two correlated data sets is non-adjustable with respect to its non-common parameters, the adjustment of the other set alone provides directly the jointly optimal estimates of its parameters, provided that correlation is ignored and the weight matrix is simply the inverse of the corresponding covariance factor submatrix.

Once the estimates \(\hat {\mathbf {x}}, \hat {\mathbf {y}}\) and their covariance factor matrices \({\mathbf {Q}}_{\hat {\mathbf {x}}}\), \({\mathbf {Q}}_{\hat {\mathbf {y}}}\), \({\mathbf {Q}}_{\hat {\mathbf {x}}\hat {\mathbf {y}}}\) have been thus determined, the remaining estimates \(\hat {\mathbf {z}}\) can be computed using Eq. (302). The covariance factor matrices can be obtained from the inversion of the coefficient matrix of the joint normal equations (298). It turns out that the submatrices \({\mathbf {Q}}_{\hat {\mathbf {x}}}\), \({\mathbf {Q}}_{\hat {\mathbf {y}}}\), \({\mathbf {Q}}_{\hat {\mathbf {x}}\hat {\mathbf {y}}}\) are exactly the same as computed from the adjustment of only the first data set.
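The sketch below checks Proposition 7 and Eq. (302) on synthetic correlated data: the full generalized least squares solution with the joint weight matrix coincides with the set-1-only solution (correlation ignored, weight Q1 inverse) plus the back-substitution (302).

```python
import numpy as np

rng = np.random.default_rng(4)
nx, ny, nz, m1 = 3, 2, 4, 12
A1x, A1y = rng.normal(size=(m1, nx)), rng.normal(size=(m1, ny))
A2x = rng.normal(size=(nz, nx))
A2z = rng.normal(size=(nz, nz)) + nz * np.eye(nz)    # square, invertible

b1, b2 = rng.normal(size=m1), rng.normal(size=nz)

# synthetic positive definite joint covariance factor matrix with Q12 != 0
L = rng.normal(size=(m1 + nz, m1 + nz))
Q = L @ L.T + (m1 + nz) * np.eye(m1 + nz)
Q1, Q12 = Q[:m1, :m1], Q[:m1, m1:]

# joint GLS solution with the full weight matrix P = Q^{-1}
A = np.block([[A1x, A1y, np.zeros((m1, nz))],
              [A2x, np.zeros((nz, ny)), A2z]])
b = np.concatenate([b1, b2])
P = np.linalg.inv(Q)
beta = np.linalg.solve(A.T @ P @ A, A.T @ P @ b)

# set 1 alone, correlation ignored, weight Q1^{-1}
A1 = np.hstack([A1x, A1y])
ab = np.linalg.solve(A1.T @ np.linalg.solve(Q1, A1), A1.T @ np.linalg.solve(Q1, b1))
xh, yh = ab[:nx], ab[nx:]
e1 = b1 - A1 @ ab                                    # first step of (303)
zh = np.linalg.solve(A2z, b2 - A2x @ xh - Q12.T @ np.linalg.solve(Q1, e1))  # (302)

assert np.allclose(beta, np.concatenate([xh, yh, zh]))
```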

In order to obtain the remaining covariance factor matrices \({\mathbf {Q}}_{\hat {\mathbf {x}}\hat {\mathbf {z}}}\), \({\mathbf {Q}}_{\hat {\mathbf {y}}\hat {\mathbf {z}}}\) and \({\mathbf {Q}}_{\hat {\mathbf {z}}}\), we will first derive a more general scheme for their sequential estimation, valid for any set of equations having the form of Eq. (298). We will consider two cases: the case where the coefficient matrix of the normal equations is nonsingular, and the singular case where the submatrix \({\mathbf {N}}_{\mathbf {z}}\) is invertible and the rank deficiency can be removed by minimal constraints \({\mathbf {C}}_{\mathbf {x}}^{T}\mathbf {x} + {\mathbf {C}}_{\mathbf {y}}^{T}\mathbf {y} = \mathbf {d}\), which do not involve the eliminated parameters z. The first case is rather trivial, because the desired covariance factor matrices follow from the analytical inversion of the normal equations coefficient matrix. To prove the second case we will set a = [xTyT]T and write (298) in the compact form
$$\displaystyle \begin{aligned} \left[\begin{array}{cc} {\mathbf{N}}_{\mathbf{a}} & {\mathbf{N}}_{\mathbf{az}}\\ {\mathbf{N}}_{\mathbf{az}}^{T} & {\mathbf{N}}_{\mathbf{z}} \end{array} \right] \left[\begin{array}{c} \hat{\mathbf{a}}\\ \hat{\mathbf{z}} \end{array}\right] = \left[\begin{array}{c} {\mathbf{u}}_{\mathbf{a}}\\ {\mathbf{u}}_{\mathbf{z}} \end{array} \right], \end{aligned} $$
(306)
with minimal constraints \({\mathbf {C}}_{\mathbf {a}}^{T}\mathbf {a} = \mathbf {d}\). The inner constraints matrix has the form \(\mathbf {E} = \left [\begin {array}{c} {\mathbf {E}}_{\mathbf {a}}\\ {\mathbf {E}}_{\mathbf {z}}\end {array}\right ]\), satisfying
$$\displaystyle \begin{aligned} \left[ \begin{array}{cc} {\mathbf{N}}_{\mathbf{a}} & {\mathbf{N}}_{\mathbf{az}}\\ {\mathbf{N}}_{\mathbf{az}}^T & {\mathbf{N}}_{\mathbf{z}} \end{array} \right] \left[\begin{array}{c} {\mathbf{E}}_{\mathbf{a}}\\ {\mathbf{E}}_{\mathbf{z}} \end{array} \right] = \left[ \begin{array}{c} {\mathbf{N}}_{\mathbf{a}} {\mathbf{E}}_{\mathbf{a}} + {\mathbf{N}}_{\mathbf{az}}{\mathbf{E}}_{\mathbf{z}}\\ {\mathbf{N}}_{\mathbf{az}}^T {\mathbf{E}}_{\mathbf{a}} + {\mathbf{N}}_{\mathbf{z}}{\mathbf{E}}_{\mathbf{z}} \end{array}\right] = \left[\begin{array}{c} \mathbf{0}\\ \mathbf{0} \end{array} \right]. \end{aligned} $$
(307)
Application of Eq. (79) gives the covariance factor matrix
$$\displaystyle \begin{aligned} \left[\begin{array}{cc} {\mathbf{Q}}_{\hat{\mathbf{a}}} & {\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{z}}}\\ {\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{z}}}^T & {\mathbf{Q}}_{\hat{\mathbf{z}}} \end{array} \right] = \left[\begin{array}{cc} {\mathbf{N}}_{\mathbf{a}} + {\mathbf{C}}_{\mathbf{a}}{\mathbf{C}}_{\mathbf{a}}^T & {\mathbf{N}}_{\mathbf{az}}\\ {\mathbf{N}}_{\mathbf{az}}^T & {\mathbf{N}}_{\mathbf{z}} \end{array} \right]^{-1} - \left[\begin{array}{cc} {\mathbf{E}}_{\mathbf{a}}{\mathbf{R}}^{-1}{\mathbf{E}}_{\mathbf{a}}^T & {\mathbf{E}}_{\mathbf{a}}{\mathbf{R}}^{-1}{\mathbf{E}}_{\mathbf{z}}^T\\ {\mathbf{E}}_{\mathbf{z}}{\mathbf{R}}^{-1}{\mathbf{E}}_{\mathbf{a}}^T & {\mathbf{E}}_{\mathbf{z}}{\mathbf{R}}^{-1}{\mathbf{E}}_{\mathbf{z}}^T \end{array} \right] \end{aligned} $$
(308)
with \(\mathbf {R} = {\mathbf {E}}^T \mathbf {C}\, {\mathbf {C}}^T \mathbf {E} = {\mathbf {E}}_a^T{\mathbf {C}}_a{\mathbf {C}}_a^T{\mathbf {E}}_a\). Analytical inversion gives
$$\displaystyle \begin{aligned} \left[\begin{array}{cc} {\mathbf{N}}_{\mathbf{a}} + {\mathbf{C}}_{\mathbf{a}}{\mathbf{C}}_{\mathbf{a}}^T & {\mathbf{N}}_{\mathbf{az}}\\ {\mathbf{N}}_{\mathbf{az}}^T & {\mathbf{N}}_{\mathbf{z}} \end{array} \right]^{-1} = \left[\begin{array}{cc} \mathbf{X} & \mathbf{Y}\\ {\mathbf{Y}}^T & \mathbf{W}\end{array}\right] = \left[ \begin{array}{cc} {\mathbf{Q}}_{\hat{\mathbf{a}}} + {\mathbf{E}}_{\mathbf{a}}{\mathbf{R}}^{-1}{\mathbf{E}}_{\mathbf{a}}^{T} & {\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{z}}} + {\mathbf{E}}_{\mathbf{a}}{\mathbf{R}}^{-1}{\mathbf{E}}_{\mathbf{z}}^{T}\\ {\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{z}}}^T + {\mathbf{E}}_{\mathbf{z}}{\mathbf{R}}^{-1}{\mathbf{E}}_{\mathbf{a}}^{T} & {\mathbf{Q}}_{\hat{\mathbf{z}}} + {\mathbf{E}}_{\mathbf{z}}{\mathbf{R}}^{-1}{\mathbf{E}}_{\mathbf{z}}^{T} \end{array}\right], \end{aligned} $$
(309)
where
$$\displaystyle \begin{aligned} &\mathbf{X} = ({\mathbf{N}}_{\mathbf{a}} + {\mathbf{C}}_{\mathbf{a}}{\mathbf{C}}_{\mathbf{a}}^{T} - {\mathbf{N}}_{\mathbf{az}}{\mathbf{N}}_{\mathbf{z}}^{-1}{\mathbf{N}}_{\mathbf{az}}^{T})^{-1} = {\mathbf{Q}}_{\hat{\mathbf{a}}} + {\mathbf{E}}_{\mathbf{a}}{\mathbf{R}}^{-1}{\mathbf{E}}_{\mathbf{a}}^{T},\\ &\mathbf{Y} = - \mathbf{X\,N}_{\mathbf{az}}{\mathbf{N}}_{\mathbf{z}}^{-1} = -({\mathbf{Q}}_{\hat{\mathbf{a}}} + {\mathbf{E}}_{\mathbf{a}}{\mathbf{R}}^{-1}{\mathbf{E}}_{\mathbf{a}}^{T}){\mathbf{N}}_{\mathbf{az}}{\mathbf{N}}_{\mathbf{z}}^{-1} = {\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{z}}} + {\mathbf{E}}_{\mathbf{a}}{\mathbf{R}}^{-1}{\mathbf{E}}_{\mathbf{z}}^{T},\\ &\mathbf{W} = {\mathbf{N}}_{\mathbf{z}}^{-1} - {\mathbf{N}}_{\mathbf{z}}^{-1} {\mathbf{N}}_{\mathbf{az}}^{T}\mathbf{Y} = {\mathbf{N}}_{\mathbf{z}}^{-1} - {\mathbf{N}}_{\mathbf{z}}^{-1}{\mathbf{N}}_{\mathbf{az}}^{T}({\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{z}}} + {\mathbf{E}}_{\mathbf{a}}{\mathbf{R}}^{-1}{\mathbf{E}}_{\mathbf{z}}^{T}) = {\mathbf{Q}}_{\hat{\mathbf{z}}} + {\mathbf{E}}_{\mathbf{z}}{\mathbf{R}}^{-1} {\mathbf{E}}_{\mathbf{z}}^{T}. {} \end{aligned} $$
(310)
Solving the last two for \({\mathbf {Q}}_{\hat {\mathbf {a}}\hat {\mathbf {z}}} \) and \({\mathbf {Q}}_{\hat {\mathbf {z}}} \) we obtain
$$\displaystyle \begin{aligned} &{\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{z}}} = - {\mathbf{Q}}_{\hat{\mathbf{a}}} {\mathbf{N}}_{\mathbf{az}} {\mathbf{N}}_{\mathbf{z}}^{ - 1} - {\mathbf{E}}_{\mathbf{a}} {\mathbf{R}}^{ - 1}({\mathbf{E}}_{\mathbf{a}}^T {\mathbf{N}}_{\mathbf{az}} + {\mathbf{E}}_{\mathbf{z}}^T {\mathbf{N}}_{\mathbf{z}} ){\mathbf{N}}_{\mathbf{z}}^{ - 1} = - {\mathbf{Q}}_{\hat{\mathbf{a}}} {\mathbf{N}}_{\mathbf{az}} {\mathbf{N}}_{\mathbf{z}}^{ - 1} , \\ &{\mathbf{Q}}_{\hat{\mathbf{z}}} = {\mathbf{N}}_{\mathbf{z}}^{ - 1} - {\mathbf{N}}_{\mathbf{z}}^{ - 1} {\mathbf{N}}_{\mathbf{az}}^T {\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{z}}} - {\mathbf{N}}_{\mathbf{z}}^{ - 1} ({\mathbf{N}}_{\mathbf{az}}^T {\mathbf{E}}_{\mathbf{a}} + {\mathbf{N}}_{\mathbf{z}} {\mathbf{E}}_{\mathbf{z}} ){\mathbf{R}}^{ - 1}{\mathbf{E}}_{\mathbf{z}}^T = {\mathbf{N}}_{\mathbf{z}}^{ - 1} - {\mathbf{N}}_{\mathbf{z}}^{ - 1} {\mathbf{N}}_{\mathbf{az}}^T {\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{z}}} ,{} \end{aligned} $$
(311)
since from (307) \({\mathbf {N}}_{\mathbf {az}}^T {\mathbf {E}}_{\mathbf {a}} + {\mathbf {N}}_{\mathbf {z}} {\mathbf {E}}_{\mathbf {z}} = \mathbf {0}\). Expressing a and the corresponding submatrices in terms of x and y, we obtain the sequential algorithm
$$\displaystyle \begin{aligned} &{\mathbf{Q}}_{\hat{\mathbf{x}}\hat{\mathbf{z}}} = - ({\mathbf{Q}}_{\hat{\mathbf{x}}} {\mathbf{N}}_{\mathbf{xz}} + {\mathbf{Q}}_{\hat{\mathbf{x}}\hat{\mathbf{y}}} {\mathbf{N}}_{\mathbf{yz}} ){\mathbf{N}}_{\mathbf{z}}^{ - 1} ,\qquad {\mathbf{Q}}_{\hat{\mathbf{y}}\hat{\mathbf{z}}} = - ({\mathbf{Q}}_{\hat{\mathbf{x}}\hat{\mathbf{y}}}^T {\mathbf{N}}_{\mathbf{x}\mathbf{z}} + {\mathbf{Q}}_{\hat{\mathbf{y}}} {\mathbf{N}}_{\mathbf{y}\mathbf{z}} ){\mathbf{N}}_{\mathbf{z}}^{ - 1} , \\ &{\mathbf{Q}}_{\hat{\mathbf{z}}} = {\mathbf{N}}_{\mathbf{z}}^{ - 1} - {\mathbf{N}}_{\mathbf{z}}^{ - 1} ({\mathbf{N}}_{\mathbf{x}\mathbf{z}}^T {\mathbf{Q}}_{\hat{\mathbf{x}}\hat{\mathbf{z}}} + {\mathbf{N}}_{\mathbf{y}\mathbf{z}}^T {\mathbf{Q}}_{\hat{\mathbf{y}}\hat{\mathbf{z}}} ).{} \end{aligned} $$
(312)
The above is a general result for the computation of covariance factor matrices of the eliminated parameters z, which will be used later.
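In the nonsingular case the sequential scheme reduces to the standard block-inversion identities; the toy check below (synthetic positive definite normal matrix) recovers all three blocks of the inverse without ever inverting the full system.

```python
import numpy as np

rng = np.random.default_rng(5)
na, nz = 5, 3
M = rng.normal(size=(na + nz, na + nz))
N = M @ M.T + (na + nz) * np.eye(na + nz)        # full-rank normal matrix
Na, Naz, Nz = N[:na, :na], N[:na, na:], N[na:, na:]

Qfull = np.linalg.inv(N)                         # reference: direct inversion

# sequential scheme, Eqs. (311)/(312) with the constraint terms absent
Qa = np.linalg.inv(Na - Naz @ np.linalg.solve(Nz, Naz.T))   # reduced normals
Qaz = -Qa @ Naz @ np.linalg.inv(Nz)
Qz = np.linalg.inv(Nz) - np.linalg.solve(Nz, Naz.T) @ Qaz

assert np.allclose(Qa, Qfull[:na, :na])
assert np.allclose(Qaz, Qfull[:na, na:])
assert np.allclose(Qz, Qfull[na:, na:])
```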
Returning to the present case of non-adjustable observations with respect to the parameters z, and replacing the submatrices \({\mathbf {N}}_{\mathbf {z}} \), \({\mathbf {N}}_{\mathbf {xz}} \), \({\mathbf {N}}_{\mathbf {yz}}\) with their particular values from Eq. (299), we obtain
$$\displaystyle \begin{aligned} &{\mathbf{Q}}_{\hat{\mathbf{x}}\hat{\mathbf{z}}} = - {\mathbf{Q}}_{\hat{\mathbf{x}}} ({\mathbf{A}}_{1x}^T {\mathbf{P}}_{12} {\mathbf{P}}_2^{ - 1} + {\mathbf{A}}_{2x}^{ T} ){\mathbf{A}}_{2z}^{ - T} - {\mathbf{Q}}_{\hat{\mathbf{x}}\hat{\mathbf{y}}} {\mathbf{A}}_{1y}^T {\mathbf{P}}_{12} {\mathbf{P}}_2^{ - 1} {\mathbf{A}}_{2z}^{ - T} \\ &\ \,\ \quad = {\mathbf{Q}}_{\hat{\mathbf{x}}} ({\mathbf{A}}_{1x}^T {\mathbf{Q}}_1^{ - 1} {\mathbf{Q}}_{12} - {\mathbf{A}}_{2x}^{T} ){\mathbf{A}}_{2z}^{ - T} + {\mathbf{Q}}_{\hat{\mathbf{x}}\hat{\mathbf{y}}} {\mathbf{A}}_{1y}^T {\mathbf{Q}}_1^{ - 1} {\mathbf{Q}}_{12} {\mathbf{A}}_{2z}^{ - T} , \\ &{\mathbf{Q}}_{\hat{\mathbf{y}}\hat{\mathbf{z}}} = - {\mathbf{Q}}_{\hat{\mathbf{x}}\hat{\mathbf{y}}}^T ({\mathbf{A}}_{1x}^T {\mathbf{P}}_{12} {\mathbf{P}}_2^{ - 1} + {\mathbf{A}}_{2x}^{ T} ){\mathbf{A}}_{2z}^{ - T} - {\mathbf{Q}}_{\hat{\mathbf{y}}} {\mathbf{A}}_{1y}^T {\mathbf{P}}_{12} {\mathbf{P}}_2^{ - 1} {\mathbf{A}}_{2z}^{ - T} \\ & \ \, \ \quad = {\mathbf{Q}}_{\hat{\mathbf{x}}\hat{\mathbf{y}}}^T ({\mathbf{A}}_{1x}^T {\mathbf{Q}}_1^{ - 1} {\mathbf{Q}}_{12} - {\mathbf{A}}_{2x}^{ T} ){\mathbf{A}}_{2z}^{ - T} + {\mathbf{Q}}_{\hat{\mathbf{y}}} {\mathbf{A}}_{1y}^T {\mathbf{Q}}_1^{ - 1} {\mathbf{Q}}_{12} {\mathbf{A}}_{2z}^{ - T} , \\ &{\mathbf{Q}}_{\hat{\mathbf{z}}} = ({\mathbf{A}}_{2z}^T {\mathbf{P}}_2 {\mathbf{A}}_{2z} )^{ - 1} - {\mathbf{A}}_{2z}^{ - 1} ({\mathbf{P}}_2^{ - 1} {\mathbf{P}}_{12}^T {\mathbf{A}}_{1x} + {\mathbf{A}}_{2x} ){\mathbf{Q}}_{\hat{\mathbf{x}}\hat{\mathbf{z}}} - {\mathbf{A}}_{2z}^{ - 1} {\mathbf{P}}_2^{ - 1} {\mathbf{P}}_{12}^T {\mathbf{A}}_{1y} {\mathbf{Q}}_{\hat{\mathbf{y}}\hat{\mathbf{z}}} ,{} \end{aligned} $$
(313)
where we have used the relations (297).

14 Stacking of a Coordinate Time Series from a Single Space Technique

In the joint treatment of data from the four space techniques and of the data related to ties at the collocation sites, we first need to form the normal equations of the stacking procedure for each technique separately. These can be used either directly for the joint ITRF solution, or separately, to obtain separate ITRF parameter estimates for each technique, which will then be combined to obtain the final joint estimates. In the second case the use of minimal constraints is necessary to obtain the separate estimates. In general, it is advisable to compute separate solutions (and not only their further needed normal equation matrices) as an intermediate diagnostic step, since problems pertaining to one data set will show up in the separate estimates, but might be masked in the joint solution under the influence of the correct information from the other data sets. This is particularly well known to surveyors: outliers are detected by data snooping much more easily when suspicious geodetic subnetworks are adjusted independently, rather than within the adjustment of a whole large network.

We will start by formulating the normal equations for a particular (unspecified) space technique. In the first step we will ignore the EOP data, and then we will examine their joint treatment. Since the model for the stacking has the same form whether we work with the original parameters or with corrections to their approximate values, we will consider only the latter, since it forms the basis of an iterative solution of the original non-linear model. From the basic model
$$\displaystyle \begin{aligned} \Delta {\mathbf{x}}_i^{ob} (t_k ) = \delta {\mathbf{x}}_{0i} + (t_k - t_0 )\delta {\mathbf{v}}_i + {\mathbf{E}}_i \delta {\mathbf{z}}_k + {\mathbf{e}}_{ik} ,\quad i = 1,2,\ldots ,n,\quad k = 1,2,\ldots ,m, \end{aligned} $$
(314)
we form the observation equations for all data in epoch tk
$$\displaystyle \begin{aligned} \Delta \mathbf{x}(t_k ) & = \left[ \begin{array}{c} \vdots \\ {\Delta {\mathbf{x}}_i^{ob} (t_k )} \\ \vdots \end{array} \right] = \left[ \begin{array}{c} \vdots \\ {\delta {\mathbf{x}}_{0i} } \\ \vdots \end{array} \right] + (t_k - t_0 )\left[ \begin{array}{c} \vdots \\ {\delta {\mathbf{v}}_i } \\ \vdots \end{array} \right] + \left[ \begin{array}{c} \vdots \\ {{\mathbf{E}}_i } \\ \vdots \end{array} \right]\delta {\mathbf{z}}_k + \left[ \begin{array}{c} \vdots \\ {{\mathbf{e}}_{ik} } \\ \vdots \end{array} \right] =\\ & = \delta {\mathbf{x}}_0 + (t_k - t_0 )\delta \mathbf{v} + \mathbf{E}\delta {\mathbf{z}}_k + {\mathbf{e}}_k , \end{aligned} $$
(315)
and the total observation equations for all epochs
$$\displaystyle \begin{aligned} \mathbf{b} & {=} \left[ \begin{array}{c} \vdots \\ {\Delta \mathbf{x}(t_k )} \\ \vdots \end{array} \right] {=} \left[ \begin{array}{c} \vdots \\ {{\mathbf{I}}_{3n} } \\ \vdots \end{array} \right]\delta \mathbf{x} + \left[ \begin{array}{c} \vdots \\ {(t_k - t_0 ){\mathbf{I}}_{3n} } \\ \vdots \end{array} \right]\delta \mathbf{v} + \left[ \begin{array}{ccc} \ddots & \vdots & \ddots \\ \cdots & \mathbf{E} & \cdots \\ \ddots & \vdots & \ddots \end{array} \right]\left[ \begin{array}{c} \vdots \\ {\delta {\mathbf{z}}_k } \\ \vdots \end{array} \right] {+} \left[ \begin{array}{c} \vdots \\ {{\mathbf{e}}_k } \\ \vdots \end{array} \right] \\ & = ({\mathbf{1}}_m \otimes {\mathbf{I}}_{3n} )\delta \mathbf{x} + (\boldsymbol{\uptau } \otimes {\mathbf{I}}_{3n} )\delta \mathbf{v} + ({\mathbf{I}}_m \otimes \mathbf{E})\delta \mathbf{z} + \mathbf{e} \equiv \mathbf{J}\delta \mathbf{x} + {\mathbf{J}}_t \delta \mathbf{v} + \mathbf{G}\delta \mathbf{z} + \mathbf{e}, \end{aligned} $$
(316)
where τ has elements τk = tk − t0. In order to accommodate the fact that not all stations participate in all observations of epoch tk, we add zero entries for the missing observations and replace the weight matrix Pk of the observations with an inflated version \(\bar {\mathbf {P}}_k \), by adding zero columns and rows corresponding to the non-observing stations. The final normal equations have the form
$$\displaystyle \begin{aligned} \left[ \begin{array}{cc} {{\mathbf{N}}_{\mathbf{a}} } & {{\mathbf{N}}_{\mathbf{az}} } \\ {{\mathbf{N}}_{\mathbf{az}}^T } & {{\mathbf{N}}_{\mathbf{z}} } \end{array} \right]\left[ \begin{array}{c} {\delta \hat{\mathbf{a}}} \\ {\delta \hat{\mathbf{z}}} \end{array} \right] = \left[ \begin{array}{ccc} {{\mathbf{N}}_{\mathbf{x}} } & {{\mathbf{N}}_{\mathbf{xv}} } & {{\mathbf{N}}_{\mathbf{xz}} } \\ {{\mathbf{N}}_{\mathbf{xv}}^T } & {{\mathbf{N}}_{\mathbf{v}} } & {{\mathbf{N}}_{\mathbf{vz}} } \\ {{\mathbf{N}}_{\mathbf{xv}}^T } & {{\mathbf{N}}_{\mathbf{vz}}^T } & {{\mathbf{N}}_{\mathbf{z}} } \end{array} \right]\left[ \begin{array}{c} {\delta \hat{\mathbf{x}}} \\ {\delta \hat{\mathbf{v}}} \\ {\delta \hat{\mathbf{z}}} \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{u}}_{\mathbf{x}} } \\ {{\mathbf{u}}_{\mathbf{v}} } \\ {{\mathbf{u}}_{\mathbf{z}} } \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{u}}_{\mathbf{a}} } \\ {{\mathbf{u}}_{\mathbf{z}} } \end{array} \right], \end{aligned} $$
(317)
where \(\delta \hat {\mathbf {a}} = \left [ \begin {array}{c} {\delta \hat {\mathbf {x}}} \\ {\delta \hat {\mathbf {v}}} \end {array} \right ]\) and
$$\displaystyle \begin{aligned} &{\mathbf{N}}_{\mathbf{x}} = {\mathbf{J}}^T\mathbf{PJ} = \sum_{k = 1}^m {\bar{\mathbf{P}}_k } ,\quad {\mathbf{N}}_{\mathbf{xv}} = {\mathbf{J}}^T\mathbf{PJ}_t = \sum_{k = 1}^m {(t_k - t_0 )\bar{\mathbf{P}}_k } , \\ &{\mathbf{N}}_{\mathbf{v}} = {\mathbf{J}}_t^T \mathbf{PJ}_t = \sum_{k = 1}^m {(t_k - t_0 )^2\bar{\mathbf{P}}_k } , \\ &{\mathbf{N}}_{\mathbf{xz}} = {\mathbf{J}}^T\mathbf{PG} = \mathrm{BR(}{\mathbf{N}}_{\mathbf{xz}_k } ),\qquad \qquad \qquad {\mathbf{N}}_{\mathbf{xz}_k } = \bar{\mathbf{P}}_k \mathbf{E}, \\ &{\mathbf{N}}_{\mathbf{vz}} = {\mathbf{J}}_t^T \mathbf{PG} = \mathrm{BR(}{\mathbf{N}}_{\mathbf{vz}_k } ),\qquad \qquad \qquad {\mathbf{N}}_{\mathbf{vz}_k } = (t_k - t_0 )\bar{\mathbf{P}}_k \mathbf{E}, \\ {} &{\mathbf{N}}_{\mathbf{z}} = {\mathbf{G}}^T\mathbf{PG} = \mathrm{BD}({\mathbf{N}}_{{\mathbf{z}}_k } ),\ \; \qquad \qquad \qquad {\mathbf{N}}_{{\mathbf{z}}_k } = {\mathbf{E}}^T\bar{\mathbf{P}}_k \mathbf{E}, \end{aligned} $$
(318)
$$\displaystyle \begin{aligned} &{\mathbf{u}}_{\mathbf{x}} = {\mathbf{J}}^T\mathbf{Pb} = \sum_{k = 1}^m {\bar{\mathbf{P}}_k \Delta \mathbf{x}(t_k )} , \\ &{\mathbf{u}}_{\mathbf{v}} = {\mathbf{J}}_t^T \mathbf{Pb} = \sum_{k = 1}^m {(t_k - t_0 )\bar{\mathbf{P}}_k \Delta \mathbf{x}(t_k )} , \\ {} &{\mathbf{u}}_{\mathbf{z}} = {\mathbf{G}}^T\mathbf{Pb} = \mathrm{BC}({\mathbf{u}}_{{\mathbf{z}}_k } ),\quad {\mathbf{u}}_{{\mathbf{z}}_k } = {\mathbf{E}}^T\bar{\mathbf{P}}_k \Delta \mathbf{x}(t_k ). \end{aligned} $$
(319)
In order to express the various elements in terms of the actual per-epoch weight matrices Pk, we denote by (Pk)ij the 3 × 3 submatrix of Pk corresponding to the stations i and j, and we set \((\bar {\mathbf {P}}_k )_{ij} = \Delta _{ik} \Delta _{jk} ({\mathbf {P}}_k )_{ij} \), where Δik = 1 if station i participates in the observations of epoch tk, and Δik = 0 if it does not. We will also denote by Ti the subset of the set of epoch indices k = 1, 2, …, m in which station i participates, and by Sk the subset of the set of station indices i = 1, 2, …, n corresponding to the stations participating at epoch tk. When the indices Δik appear in a summation over all epochs and stations it holds that
$$\displaystyle \begin{aligned} \sum_{i = 1}^n {\Delta_{ik} Q_i = \sum_{i \in S_k } {Q_i ,\quad } } \sum_{k = 1}^m {\Delta_{ik} Q_k = \sum_{k \in T_i } {Q_k .} } \end{aligned} $$
(320)
The submatrices Nx, Nxv and Nv share the common form \(\sum \limits _{k = 1}^m {\bar {\tau }_k \bar {\mathbf {P}}_k } \) with \(\bar {\tau }_k \) taking the respective values \(\bar {\tau }_k = 1\), \(\bar {\tau }_k = t_k - t_0 \) and \(\bar {\tau }_k = (t_k - t_0 )^2\). Setting
$$\displaystyle \begin{aligned} {\mathbf{N}}_{\bar{\tau }} = \sum_{k = 1}^m \bar{\tau }_k \bar{\mathbf{P}}_k ,\qquad ({\mathbf{N}}_{\bar{\tau }})_{ij} = \sum_{k = 1}^m \bar{\tau }_k \Delta_{ik} \Delta_{jk} ({\mathbf{P}}_k)_{ij} = \sum_{k \in T_i \cap T_j } \bar{\tau }_k ({\mathbf{P}}_k)_{ij} \end{aligned} $$
(321)
we have \({\mathbf {N}}_{\mathbf {x}} = {\mathbf {N}}_{\bar {\tau }_k = 1 } \), \({\mathbf {N}}_{\mathbf {xv}} = {\mathbf {N}}_{\bar {\tau }_k = t_k - t_0 } \), \({\mathbf {N}}_{\mathbf {v}} = {\mathbf {N}}_{\bar {\tau }_k = (t_k - t_0 )^2} \), with station submatrices of the form \(({\mathbf {N}}_{\bar {\tau }} )_{ij} = \sum \limits _{k \in T_i \cap T_j } {\bar {\tau }_k ({\mathbf {P}}_k )_{ij} } \), explicitly
$$\displaystyle \begin{aligned} {\mathbf{N}}_{{\mathbf{x}}_i {\mathbf{x}}_j } & = \sum_{k \in T_i \cap T_j } {({\mathbf{P}}_k )_{ij} } ,\quad {\mathbf{N}}_{{\mathbf{x}}_i {\mathbf{v}}_j } = \sum_{k \in T_i \cap T_j } {(t_k - t_0 )({\mathbf{P}}_k )_{ij} } ,\\ {\mathbf{N}}_{{\mathbf{v}}_i {\mathbf{v}}_j } &= \sum_{k \in T_i \cap T_j } {(t_k - t_0 )^2({\mathbf{P}}_k )_{ij} } . \end{aligned} $$
(322)
The remaining submatrices of N and u per epoch and/or station are computed from
$$\displaystyle \begin{aligned} &{\mathbf{N}}_{{\mathbf{x}}_i {\mathbf{z}}_k } = \Delta_{ik} \sum_{j \in S_k } {({\mathbf{P}}_k )_{ij} {\mathbf{E}}_j ,\qquad \qquad } {\mathbf{N}}_{{\mathbf{v}}_i {\mathbf{z}}_k } = (t_k - t_0 ){\mathbf{N}}_{{\mathbf{x}}_i {\mathbf{z}}_k } , \\ {} &{\mathbf{N}}_{{\mathbf{z}}_k } = \sum_{i \in S_k } {\sum_{j \in S_k } {{\mathbf{E}}_i^T } ({\mathbf{P}}_k )_{ij} {\mathbf{E}}_j ,} \end{aligned} $$
(323)
$$\displaystyle \begin{aligned} &{\mathbf{u}}_{{\mathbf{x}}_i } = \sum_{k \in T_i } {\sum_{j \in S_k } {({\mathbf{P}}_k )_{ij}} \Delta {\mathbf{x}}_j^{ob} } (t_k ),\qquad {\mathbf{u}}_{{\mathbf{v}}_i } = \sum_{k \in T_i } {(t_k - t_0 )\sum_{j \in S_k } {({\mathbf{P}}_k )_{ij} } \Delta {\mathbf{x}}_j^{ob} } (t_k ), \\ {} &{\mathbf{u}}_{{\mathbf{z}}_k } = \sum_{i \in S_k } {{\mathbf{E}}_i^T \sum_{j \in S_k } {({\mathbf{P}}_k )_{ij} } \Delta {\mathbf{x}}_j^{ob} } (t_k ). \end{aligned} $$
(324)
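As an illustration of Eqs. (318)–(324), the sketch below assembles the stacking normal equation blocks from synthetic per-epoch weight matrices. The per-station Helmert design \({\mathbf{E}}_i = [\,{\mathbf{I}}_3,\ [{\mathbf{x}}_{0i}\times],\ {\mathbf{x}}_{0i}\,]\) (translation, rotation, scale) and all dimensions are assumptions made for the example; the participation flags Δik enter by masking Pk into the inflated \(\bar{\mathbf{P}}_k\).

```python
import numpy as np

rng = np.random.default_rng(6)
n, m, p = 4, 6, 7                       # stations, epochs, Helmert parameters
t = np.linspace(0.0, 5.0, m)
t0 = t.mean()
x0 = rng.normal(size=(n, 3))            # approximate station coordinates

def cross(v):                           # skew-symmetric matrix [v x]
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

# per-station rows E_i of the Helmert design (translation, rotation, scale)
E = np.vstack([np.hstack([np.eye(3), cross(xi), xi[:, None]]) for xi in x0])

part = rng.random((n, m)) < 0.8         # Delta_ik participation flags
Nx = np.zeros((3 * n, 3 * n)); Nxv = np.zeros_like(Nx); Nv = np.zeros_like(Nx)
ux = np.zeros(3 * n); uv = np.zeros(3 * n)
Nxz, Nvz, Nz, uz = [], [], [], []
for k in range(m):
    M = rng.normal(size=(3 * n, 3 * n))
    Pk = M @ M.T + 3 * n * np.eye(3 * n)            # synthetic epoch weights
    mask = np.repeat(part[:, k], 3).astype(float)
    Pbar = mask[:, None] * Pk * mask[None, :]       # inflated weights (zero rows/cols)
    dx = mask * rng.normal(size=3 * n)              # observed Delta x(t_k), zero-filled
    tau = t[k] - t0
    Nx += Pbar; Nxv += tau * Pbar; Nv += tau**2 * Pbar          # Eq. (318)
    Nxz.append(Pbar @ E); Nvz.append(tau * Pbar @ E); Nz.append(E.T @ Pbar @ E)
    ux += Pbar @ dx; uv += tau * (Pbar @ dx); uz.append(E.T @ Pbar @ dx)  # (319)
```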
To form the contribution to the total ITRF normal equations we need to consider the pseudo-observations for the per-epoch EOP estimates. For applications other than the ITRF formulation, the stacking solution can be computed by introducing a set of minimal constraints. To take advantage of the block-diagonal form of the Nz sub-matrix, which can thus be inverted very easily, it is advisable to apply partial minimal constraints involving only the initial coordinates and velocities, included in the vector \(\mathbf {a} = [\delta {\mathbf {x}}_0^T \delta {\mathbf {v}}^T]^T\), e.g., kinematic constraints or generalized partial inner constraints or a combination of the two. Plain partial inner constraints are computationally simpler to apply, keeping in mind that we can a posteriori transform the solution to one satisfying any desired minimal constraints, as we have seen in Sect. 9. The general form of the constraints will be in that case \({\mathbf {C}}_{\mathbf {a}}^T \delta \mathbf {a} = {\mathbf {d}}_{\mathbf {a}} \) and the solution will be provided by applying Eqs. (78) and (79)
$$\displaystyle \begin{gathered} {} \qquad \;\,\left[ \begin{array}{c} {\delta \hat{\mathbf{a}}} \\ {\delta \hat{\mathbf{z}}} \end{array} \right] = \left[ \begin{array}{cc} {{\mathbf{N}}_{\mathbf{a}} + {\mathbf{C}}_{\mathbf{a}} {\mathbf{C}}_{\mathbf{a}}^T } & {{\mathbf{N}}_{\mathbf{az}} } \\ {{\mathbf{N}}_{\mathbf{az}}^T } & {{\mathbf{N}}_{\mathbf{z}} } \end{array} \right]^{ - 1}\left[ \begin{array}{c} {{\mathbf{u}}_{\mathbf{a}} } \\ {{\mathbf{u}}_{\mathbf{z}} } \end{array} \right] + \left[ \begin{array}{c} {{\mathbf{E}}_{\mathbf{a}} ({\mathbf{C}}_{\mathbf{a}}^T {\mathbf{E}}_{\mathbf{a}} )^{ - 1}{\mathbf{d}}_{\mathbf{a}} } \\ {{\mathbf{E}}_{\mathbf{z}} ({\mathbf{C}}_{\mathbf{a}}^T {\mathbf{E}}_{\mathbf{a}} )^{ - 1}{\mathbf{d}}_{\mathbf{a}} } \end{array} \right], \end{gathered} $$
(325)
$$\displaystyle \begin{gathered} {} \left[ \begin{array}{cc} {{\mathbf{Q}}_{\delta \hat{\mathbf{a}}} } & {{\mathbf{Q}}_{\delta \hat{\mathbf{a}}\delta \hat{\mathbf{z}}} } \\ {{\mathbf{Q}}_{\delta \hat{\mathbf{a}}\delta \hat{\mathbf{z}}}^T } & {{\mathbf{Q}}_{\delta \hat{\mathbf{z}}} } \end{array} \right] = \left[ \begin{array}{cc} {{\mathbf{N}}_{\mathbf{a}} + {\mathbf{C}}_{\mathbf{a}} {\mathbf{C}}_{\mathbf{a}}^T } & {{\mathbf{N}}_{\mathbf{az}} } \\ {{\mathbf{N}}_{\mathbf{az}}^T } & {{\mathbf{N}}_{\mathbf{z}} } \end{array} \right]^{ - 1} - \left[ \begin{array}{cc} {{\mathbf{E}}_{\mathbf{a}} {\mathbf{R}}^{ - 1}{\mathbf{E}}_{\mathbf{a}}^T } & {{\mathbf{E}}_{\mathbf{a}} {\mathbf{R}}^{ - 1}{\mathbf{E}}_{\mathbf{z}}^T } \\ {{\mathbf{E}}_{\mathbf{z}} {\mathbf{R}}^{ - 1}{\mathbf{E}}_{\mathbf{a}}^T } & {{\mathbf{E}}_{\mathbf{z}} {\mathbf{R}}^{ - 1}{\mathbf{E}}_{\mathbf{z}}^T } \end{array} \right],\\ \mathbf{R} = {\mathbf{E}}_{\mathbf{a}}^T {\mathbf{C}}_{\mathbf{a}} {\mathbf{C}}_{\mathbf{a}}^T {\mathbf{E}}_{\mathbf{a}} . \end{gathered} $$
(326)
Since the matrix Nz is block diagonal, it can be easily inverted and utilized in a procedure similar to the elimination of \(\delta \hat {\mathbf {z}}\) in the full rank case, where no constraints are needed. Analytical inversion gives
$$\displaystyle \begin{aligned} \left[ \begin{array}{cc} {{\mathbf{N}}_{\mathbf{a}} + {\mathbf{C}}_{\mathbf{a}} {\mathbf{C}}_{\mathbf{a}}^T } & {{\mathbf{N}}_{\mathbf{az}} } \\ {{\mathbf{N}}_{\mathbf{az}}^T } & {{\mathbf{N}}_{\mathbf{z}} } \end{array} \right]^{ - 1} = \left[ \begin{array}{cc} {{\mathbf{G}}_{\mathbf{a}} } & {{\mathbf{G}}_{\mathbf{az}} } \\ {{\mathbf{G}}_{\mathbf{az}}^T } & {{\mathbf{G}}_{\mathbf{z}} } \end{array} \right], \end{aligned} $$
(327)
where
$$\displaystyle \begin{aligned} &{\mathbf{G}}_{\mathbf{a}} = ({\mathbf{N}}_{\mathbf{a}} - {\mathbf{N}}_{\mathbf{az}}^T {\mathbf{N}}_{\mathbf{z}}^{ - 1} {\mathbf{N}}_{\mathbf{az}} + {\mathbf{C}}_{\mathbf{a}} {\mathbf{C}}_{\mathbf{a}}^T )^{ - 1},\quad {\mathbf{G}}_{\mathbf{az}} = - {\mathbf{G}}_{\mathbf{a}} {\mathbf{N}}_{\mathbf{az}} {\mathbf{N}}_{\mathbf{z}}^{ - 1} , \\ {} &{\mathbf{G}}_{\mathbf{z}} = {\mathbf{N}}_{\mathbf{z}}^{ - 1} - {\mathbf{N}}_{\mathbf{z}}^{ - 1} {\mathbf{N}}_{\mathbf{az}}^T {\mathbf{G}}_{\mathbf{az}} = {\mathbf{N}}_{\mathbf{z}}^{ - 1} + {\mathbf{N}}_{\mathbf{z}}^{ - 1} {\mathbf{N}}_{\mathbf{az}}^T {\mathbf{G}}_{\mathbf{a}} {\mathbf{N}}_{\mathbf{az}} {\mathbf{N}}_{\mathbf{z}}^{ - 1} . \end{aligned} $$
(328)
With these values and setting
$$\displaystyle \begin{aligned} \bar{\mathbf{N}}_{\mathbf{a}} = {\mathbf{N}}_{\mathbf{a}} - {\mathbf{N}}_{\mathbf{az}}^T {\mathbf{N}}_{\mathbf{z}}^{ - 1} {\mathbf{N}}_{\mathbf{az}} ,\qquad \qquad \bar{\mathbf{u}}_{\mathbf{a}} = {\mathbf{u}}_{\mathbf{a}} - {\mathbf{N}}_{\mathbf{az}} {\mathbf{N}}_{\mathbf{z}}^{ - 1} {\mathbf{u}}_{\mathbf{z}} , \end{aligned} $$
(329)
we arrive at the following algorithm
$$\displaystyle \begin{aligned} &\delta \hat{\mathbf{a}} = (\bar{\mathbf{N}}_{\mathbf{a}} + {\mathbf{C}}_{\mathbf{a}} {\mathbf{C}}_{\mathbf{a}}^T )^{ - 1}\bar{\mathbf{u}}_{\mathbf{a}} + {\mathbf{E}}_{\mathbf{a}} ({\mathbf{C}}_{\mathbf{a}}^T {\mathbf{E}}_{\mathbf{a}} )^{ - 1}{\mathbf{d}}_{\mathbf{a}} , \end{aligned} $$
(330)
$$\displaystyle \begin{aligned} &\delta \bar{\mathbf{z}} = {\mathbf{N}}_{\mathbf{z}}^{ - 1} {\mathbf{u}}_{\mathbf{z}} - {\mathbf{N}}_{\mathbf{z}}^{ - 1} {\mathbf{N}}_{\mathbf{az}}^T \delta \hat{\mathbf{a}}, \end{aligned} $$
(331)
$$\displaystyle \begin{aligned} &{\mathbf{Q}}_{\delta \hat{\mathbf{a}}} = (\bar{\mathbf{N}}_{\mathbf{a}} + {\mathbf{C}}_{\mathbf{a}} {\mathbf{C}}_{\mathbf{a}}^T )^{ - 1} - {\mathbf{E}}_{\mathbf{a}} {\mathbf{R}}^{ - 1}{\mathbf{E}}_{\mathbf{a}}^T , \end{aligned} $$
(332)
$$\displaystyle \begin{aligned} &{\mathbf{Q}}_{\delta \hat{\mathbf{a}}\delta \hat{\mathbf{z}}} = - {\mathbf{Q}}_{\delta \hat{\mathbf{a}}} {\mathbf{N}}_{\mathbf{az}} {\mathbf{N}}_{\mathbf{z}}^{ - 1} , \end{aligned} $$
(333)
$$\displaystyle \begin{aligned} &{\mathbf{Q}}_{\delta \hat{\mathbf{z}}} = {\mathbf{N}}_{\mathbf{z}}^{ - 1} - {\mathbf{N}}_{\mathbf{z}}^{ - 1} {\mathbf{N}}_{\mathbf{az}}^T {\mathbf{Q}}_{\delta \hat{\mathbf{a}}\delta \hat{\mathbf{z}}} . \end{aligned} $$
(334)
The last two equations are a direct application of the general equations (311) derived in Sect. 13.3. The above algorithm is completely analogous to the one based on parameter elimination in the case of a full rank design matrix, where \(\bar {\mathbf {N}}_{\mathbf {a}} \) is invertible and no constraints are needed. The only difference is that Eqs. (330) and (332) have replaced the equations \(\delta \hat {\mathbf {a}} = \bar {\mathbf {N}}_{\mathbf {a}}^{ - 1} \bar {\mathbf {u}}_{\mathbf {a}} \) and \({\mathbf {Q}}_{\delta\hat {\mathbf {a}}} = \bar {\mathbf {N}}_{\mathbf {a}}^{ - 1} \), respectively, of the full rank case. The matrices \(\bar {\mathbf {N}}_{\mathbf {a}} \), \(\bar {\mathbf {u}}_{\mathbf {a}} \) appearing in the above algorithm have the explicit values
$$\displaystyle \begin{aligned} \bar{\mathbf{N}}_{\mathbf{a}} &= \left[ \begin{array}{cc} {\bar{\mathbf{N}}_{\mathbf{x}} } & {\bar{\mathbf{N}}_{\mathbf{xv}} } \\ {\bar{\mathbf{N}}_{\mathbf{xv}}^T } & {\bar{\mathbf{N}}_{\mathbf{v}} } \end{array} \right],\quad \bar{\mathbf{u}}_{\mathbf{a}} = \left[ \begin{array}{c} {\bar{\mathbf{u}}_{\mathbf{x}} } \\ {\bar{\mathbf{u}}_{\mathbf{v}} } \end{array} \right], \end{aligned} $$
(335)
$$\displaystyle \begin{aligned} \bar{\mathbf{N}}_{\mathbf{x}} &= {\mathbf{N}}_{\mathbf{x}} - \sum_{k = 1}^m {{\mathbf{N}}_{\mathbf{xz}_k } } {\mathbf{N}}_{{\mathbf{z}}_k }^{ - 1} {\mathbf{N}}_{\mathbf{xz}_k }^T ,\quad \bar{\mathbf{N}}_{\mathbf{xv}} = {\mathbf{N}}_{\mathbf{xv}} - \sum_{k = 1}^m {{\mathbf{N}}_{\mathbf{xz}_k } } {\mathbf{N}}_{{\mathbf{z}}_k }^{ - 1} {\mathbf{N}}_{\mathbf{vz}_k }^T ,\\ \bar{\mathbf{N}}_{\mathbf{v}} &= {\mathbf{N}}_{\mathbf{v}} - \sum_{k = 1}^m {{\mathbf{N}}_{\mathbf{vz}_k } } {\mathbf{N}}_{{\mathbf{z}}_k }^{ - 1} {\mathbf{N}}_{\mathbf{vz}_k }^T, \end{aligned} $$
(336)
$$\displaystyle \begin{aligned} \bar{\mathbf{u}}_{\mathbf{x}} &= {\mathbf{u}}_{\mathbf{x}} - \sum_{k = 1}^m {{\mathbf{N}}_{\mathbf{xz}_k } } {\mathbf{N}}_{{\mathbf{z}}_k }^{ - 1} {\mathbf{u}}_{{\mathbf{z}}_k } ,\qquad \bar{\mathbf{u}}_{\mathbf{v}} = {\mathbf{u}}_{\mathbf{v}} - \sum_{k = 1}^m {{\mathbf{N}}_{\mathbf{vz}_k } } {\mathbf{N}}_{{\mathbf{z}}_k }^{ - 1} {\mathbf{u}}_{{\mathbf{z}}_k } . \end{aligned} $$
(337)
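A compact full-rank analogue of the elimination algorithm (329)–(331) is sketched below with synthetic epoch-wise blocks, so the constraint terms \({\mathbf{C}}_{\mathbf{a}}{\mathbf{C}}_{\mathbf{a}}^T\) and \({\mathbf{E}}_{\mathbf{a}}{\mathbf{R}}^{-1}{\mathbf{E}}_{\mathbf{a}}^T\) drop out: the block-diagonal Nz is inverted epoch by epoch, and the result checks against the solution of the full joint system.

```python
import numpy as np

rng = np.random.default_rng(7)
na, p, m = 6, 3, 5          # dim of a = (x0, v), of each z_k, number of epochs

# synthetic full-rank joint system whose z-part is epoch-wise, so that
# N_z is block diagonal by construction
blocks = []
for k in range(m):
    Ja = rng.normal(size=(na + p, na))
    Gz = np.zeros((na + p, m * p))
    Gz[:, k*p:(k+1)*p] = rng.normal(size=(na + p, p))
    blocks.append(np.hstack([Ja, Gz]))
A = np.vstack(blocks)
b = rng.normal(size=A.shape[0])
N, u = A.T @ A, A.T @ b
Na, ua = N[:na, :na], u[:na]
Nazk = [N[:na, na+k*p:na+(k+1)*p] for k in range(m)]
Nzk  = [N[na+k*p:na+(k+1)*p, na+k*p:na+(k+1)*p] for k in range(m)]
uzk  = [u[na+k*p:na+(k+1)*p] for k in range(m)]

# epoch-wise reduction, Eqs. (329) and (336)-(337)
Nbar = Na - sum(Az @ np.linalg.solve(Nz, Az.T) for Az, Nz in zip(Nazk, Nzk))
ubar = ua - sum(Az @ np.linalg.solve(Nz, uz) for Az, Nz, uz in zip(Nazk, Nzk, uzk))

a = np.linalg.solve(Nbar, ubar)                 # full-rank analogue of (330)
z = [np.linalg.solve(Nz, uz - Az.T @ a)         # back-substitution (331)/(341)
     for Az, Nz, uz in zip(Nazk, Nzk, uzk)]

assert np.allclose(np.concatenate([a] + z), np.linalg.solve(N, u))
```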
In the particular case that partial inner constraints \({\mathbf {E}}_{\mathbf {a}}^T \delta \mathbf {a} = \mathbf {0}\) are used, we need only replace (330) and (332) with
$$\displaystyle \begin{aligned} &\delta \hat{\mathbf{a}} = (\bar{\mathbf{N}}_{\mathbf{a}} + {\mathbf{E}}_{\mathbf{a}} {\mathbf{E}}_{\mathbf{a}}^T )^{ - 1}\bar{\mathbf{u}}_{\mathbf{a}} , \end{aligned} $$
(338)
$$\displaystyle \begin{aligned} &{\mathbf{Q}}_{\delta \hat{\mathbf{a}}} = (\bar{\mathbf{N}}_{\mathbf{a}} + {\mathbf{E}}_{\mathbf{a}} {\mathbf{E}}_{\mathbf{a}}^T )^{ - 1} - {\mathbf{E}}_{\mathbf{a}} ({\mathbf{E}}_{\mathbf{a}}^T {\mathbf{E}}_{\mathbf{a}} )^{ - 2}{\mathbf{E}}_{\mathbf{a}}^T . \end{aligned} $$
(339)
Since we can a posteriori convert the stacking solution to one corresponding to any desired constraints, it is advantageous to use the above much simpler algorithm in order to derive first the solution with partial inner constraints \({\mathbf {E}}_{\mathbf {a}}^T \delta \mathbf {a} = \mathbf {0}\) involving only initial coordinates and velocities. For this reason, we give the explicit algorithm for the partial inner constraints solution:
$$\displaystyle \begin{gathered} {} \left[ \begin{array}{c} {\delta \hat{\mathbf{x}}_0 } \\ {\delta \hat{\mathbf{v}}} \end{array} \right] = \left[ \begin{array}{cc} {\bar{\mathbf{N}}_{\mathbf{x}} + \mathbf{EE}^T} & {\bar{\mathbf{N}}_{\mathbf{xv}} } \\ {\bar{\mathbf{N}}_{\mathbf{xv}}^T } & {\bar{\mathbf{N}}_{\mathbf{v}} + \mathbf{EE}^T} \end{array} \right]^{ - 1}\left[ \begin{array}{c} {\bar{\mathbf{u}}_{\mathbf{x}} } \\ {\bar{\mathbf{u}}_{\mathbf{v}} } \end{array} \right], \end{gathered} $$
(340)
$$\displaystyle \begin{gathered} {} \delta \hat{\mathbf{z}}_k = {\mathbf{N}}_{{\mathbf{z}}_k }^{ - 1} ({\mathbf{u}}_{{\mathbf{z}}_k } - {\mathbf{N}}_{\mathbf{xz}_k }^T \delta \hat{\mathbf{x}}_0 - {\mathbf{N}}_{\mathbf{vz}_k }^T \delta \hat{\mathbf{v}}), \end{gathered} $$
(341)
$$\displaystyle \begin{gathered} {} \left[ \begin{array}{cc} {{\mathbf{Q}}_{\delta \hat{\mathbf{x}}_0 } } & {{\mathbf{Q}}_{\delta \hat{\mathbf{x}}_0 \delta \hat{\mathbf{v}}} } \\ {{\mathbf{Q}}_{\delta \hat{\mathbf{x}}_0 \delta \hat{\mathbf{v}}}^T } & {{\mathbf{Q}}_{\delta \hat{\mathbf{v}}} } \end{array} \right] = \left[ \begin{array}{cc} {\bar{\mathbf{N}}_{\mathbf{x}} + \mathbf{EE}^T} & {\bar{\mathbf{N}}_{\mathbf{xv}} } \\ {\bar{\mathbf{N}}_{\mathbf{xv}}^T } & {\bar{\mathbf{N}}_{\mathbf{v}} + \mathbf{EE}^T} \end{array} \right]^{ - 1}-\\ - \left[ \begin{array}{cc} {\mathbf{E}({\mathbf{E}}^T\mathbf{E})^{ - 2}{\mathbf{E}}^T} & \mathbf{0} \\ \mathbf{0} & {\mathbf{E}({\mathbf{E}}^T\mathbf{E})^{ - 2}{\mathbf{E}}^T} \end{array} \right], \end{gathered} $$
(342)
$$\displaystyle \begin{gathered} {\mathbf{Q}}_{\delta \hat{\mathbf{x}}_{0}\delta \hat{\mathbf{z}}} = - ({\mathbf{Q}}_{\delta \hat{\mathbf{x}}_0 } {\mathbf{N}}_{\mathbf{xz}} + {\mathbf{Q}}_{\delta \hat{\mathbf{x}}_0 \delta \hat{\mathbf{v}}} {\mathbf{N}}_{\mathbf{vz}} ){\mathbf{N}}_{\mathbf{z}}^{ - 1} , \\ {\mathbf{Q}}_{\delta \hat{\mathbf{v}}\delta \hat{\mathbf{z}}} = - ({\mathbf{Q}}_{\delta \hat{\mathbf{x}}_0 \delta \hat{\mathbf{v}}}^T {\mathbf{N}}_{\mathbf{xz}} + {\mathbf{Q}}_{\delta \hat{\mathbf{v}}} {\mathbf{N}}_{\mathbf{vz}} ){\mathbf{N}}_{\mathbf{z}}^{ - 1} , \\ {} {\mathbf{Q}}_{\delta \hat{\mathbf{z}}} = {\mathbf{N}}_{\mathbf{z}}^{ - 1} - {\mathbf{N}}_{\mathbf{z}}^{ - 1} ({\mathbf{N}}_{\mathbf{xz}}^T {\mathbf{Q}}_{\delta \hat{\mathbf{x}}_0\delta \hat{\mathbf{z}}} + {\mathbf{N}}_{\mathbf{vz}}^{T} {\mathbf{Q}}_{\delta \hat{\mathbf{v}}\delta \hat{\mathbf{z}}} ). \end{gathered} $$
(343)

15 The Implementation of Time Series of Earth Orientation Parameters

When stacking is performed for the purpose of the ITRF formulation, it is necessary to take into account the available EOP time series, so that the relevant information from all space techniques can be used to obtain combined EOP estimates, in addition to the station initial coordinates and velocities, which are the main goal. As with the epoch coordinates, the EOP estimates of each epoch refer to the particular reference system defined through the applied minimal constraints. In order to use them as pseudo-observations we must know the equations which convert EOPs in the (yet undefined) stacking reference system to the reference system of the particular epoch. More generally, we need to know how EOPs transform under a change of the terrestrial reference system. Since EOPs refer to directions, they are affected only by the rotational part of the reference system transformation and not by the displacement or scale parts.

Table 1 presents the general transformation equations for EOPs under changes of either the terrestrial reference system (TRS), or the celestial reference system (CRS) or both (TRS + CRS). The change in the TRS is expressed by \(\tilde {\mathbf {x}}_T = \mathbf {R}(\boldsymbol {\uptheta }){\mathbf {x}}_T \approx (\mathbf {I} - [\boldsymbol {\uptheta }\times ]){\mathbf {x}}_T \) in terms of rotation angles θ = [θ1θ2θ3]T around the three axes, while the change in the CRS is analogously expressed by \(\tilde {\mathbf {x}}_C = \mathbf {R}(\boldsymbol {\psi }){\mathbf {x}}_C \approx (\mathbf {I} - [\boldsymbol {\uppsi }\times ]){\mathbf {x}}_C \) with angles ψ = [ψ1ψ2ψ3]T around the three axes. EOP transformation laws are given not only for the CIP but also for the instantaneous rotation axis as well as to the angular momentum axis, which is a more stable axis not following the high frequency precession-nutation variations of the instantaneous rotation axis. The CIP here does not directly relate to its rather vague theoretical definition (smoothed version of the instantaneous rotation axis after removal of sub-daily precession-nutation terms) but rather to its operational realization. According to the IERS conventions, the CIP axis is defined directly in relation to the CRS through the smoothed version of the theoretically predicted precession-nutation, after the latter are corrected, according to observational evidence provided by VLBI. The EOPs into consideration are the first two components of the unit vector of the rotation axis with respect to the CRS X, Y , the polar motion components xP, yP, which relate to the components of the rotation axis with respect to the TRS and the earth rotation angle θ and the length of the day (LOD) Λ. Λ is related to the angular velocity ω through Λ = 2πω. An alternative to LOD is universal time UT1 expressed as Tu = (Julian UT1 date − 2451545.0). It relates to the earth rotation angle θ through θ = 2π(A + BTu) where
$$\displaystyle \begin{aligned} A = 0.7790572732640\ \mathrm{and}\ B = 1.00273781191135448, \end{aligned}$$
as well as to the IERS-provided differences UT1-UTC through the trivial relation UT1 = UTC + (UT1-UTC). With respect to the EOP time series provided by the four space techniques, VLBI provides daily polar motion, polar motion rates, LOD and UT1-UTC; SLR provides polar motion and LOD at both weekly and fortnightly intervals; GPS provides daily polar motion, polar motion rates and LOD; while DORIS provides only polar motion on a weekly basis. The relations for EOPs related to the CIP under changes of only the terrestrial reference system are the ones actually used in the ITRF formulation. They have been adapted from relations developed, for the now abandoned classical earth rotation representation, by Zhu and Mueller [63].
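To make these relations concrete, here is a minimal Python sketch (the function names are ours) evaluating the earth rotation angle from a Julian UT1 date via θ = 2π(A + BTu), and the LOD from the angular velocity via Λ = 2π∕ω:

```python
import math

A = 0.7790572732640          # revolutions (constant A above)
B = 1.00273781191135448      # revolutions per UT1 day (constant B above)

def earth_rotation_angle(jd_ut1):
    """Earth rotation angle theta in radians, theta = 2*pi*(A + B*Tu),
    with Tu = (Julian UT1 date - 2451545.0)."""
    Tu = jd_ut1 - 2451545.0
    # keep only the fractional number of revolutions before scaling
    return 2.0 * math.pi * ((A + B * Tu) % 1.0)

def lod(omega):
    """Length of day Lambda = 2*pi/omega (omega in rad per time unit)."""
    return 2.0 * math.pi / omega

# at J2000.0 the earth rotation angle is about 280.46 degrees
print(math.degrees(earth_rotation_angle(2451545.0)))
```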
Table 1

Variation of EOPs under changes \(\tilde {\mathbf {x}}_T = \mathbf {R}( \boldsymbol {\uptheta }){\mathbf {x}}_T \approx (\mathbf {I} - [ \boldsymbol {\uptheta }\times ]){\mathbf {x}}_T \) (time dependent) in the terrestrial reference system (TRS), \(\tilde {\mathbf {x}}_C = \mathbf {R}( \boldsymbol {\uppsi }){\mathbf {x}}_C \approx (\mathbf {I} - [ \boldsymbol {\uppsi }\times ]){\mathbf {x}}_C \) (time fixed) in the celestial reference system (CRS), or both

Change of TRS + CRS:

- EOPs related to the CIP: \(\tilde {x}_P = x_P - \theta _2 \), \(\tilde {y}_P = y_P - \theta _1 \), \(\tilde {X} = X - \psi _2 \), \(\tilde {Y} = Y + \psi _1 \)
- EOPs related to the rotation vector: \(\tilde {X} = X - \psi _2 + \frac {\cos \theta \,\dot {\theta }_1 - \sin \theta \,\dot {\theta }_2 }{\omega }\), \(\tilde {Y} = Y + \psi _1 + \frac {\sin \theta \,\dot {\theta }_1 - \cos \theta \,\dot {\theta }_2 }{\omega }\) (the polar motion entries are not recoverable from the source images)
- EOPs related to the angular momentum: \(\tilde {\xi }_h = \xi _h - \theta _2 \), \(( - \tilde {\eta }_h ) = ( - \eta _h ) - \theta _1 \), \(\tilde {X}_h = X_h - \psi _2 \), \(\tilde {Y}_h = Y_h + \psi _1 \)
- UT1/LOD row: not recoverable from the source images

Change of TRS only:

- EOPs related to the CIP: \(\tilde {x}_P = x_P - \theta _2 \), \(\tilde {y}_P = y_P - \theta _1 \), \(\tilde {X} = X\), \(\tilde {Y} = Y\)
- EOPs related to the rotation vector: entries not recoverable from the source images
- EOPs related to the angular momentum: \(\tilde {\xi }_h = \xi _h - \theta _2 \), \(( - \tilde {\eta }_h ) = ( - \eta _h ) - \theta _1 \), \(\tilde {X}_h = X_h \), \(\tilde {Y}_h = Y_h \)
- UT1/LOD row: not recoverable from the source images

Change of CRS only:

- EOPs related to the CIP: \(\tilde {x}_P = x_P \), \(\tilde {y}_P = y_P \), \(\tilde {X} = X - \psi _2 \), \(\tilde {Y} = Y + \psi _1 \)
- EOPs related to the rotation vector: \(\tilde {x}_P = x_P \), \(\tilde {y}_P = y_P \), \(\tilde {X} = X - \psi _2 \), \(\tilde {Y} = Y + \psi _1 \)
- EOPs related to the angular momentum: \(\tilde {\xi }_h = \xi _h \), \(\tilde {\eta }_h = \eta _h \), \(\tilde {X}_h = X_h - \psi _2 \), \(\tilde {Y}_h = Y_h + \psi _1 \)
- UT1/LOD row: not recoverable from the source images

a The constant in this footnote, whose symbol is not recoverable from the source image, equals 1.00273781191135448, i.e., the value of B defined in the text.

From the TRS/CIP entries in Table 1, it follows that the EOP pseudo-observations at each epoch tk have the form qobs(tk) = qk + T0zk + eq,k, where qk are the unknown EOP parameters in the final ITRF reference system, zk are the transformation parameters from the ITRF reference system to that of the epoch tk , T0 is the known design matrix and eq,k are the relevant errors. In reduced form they become
$$\displaystyle \begin{aligned} {\mathbf{b}}_{q,k} = {\mathbf{q}}^{obs}(t_k ) - {\mathbf{T}}_0 {\mathbf{z}}_k^{ap} = {\mathbf{q}}_k + {\mathbf{T}}_0 \delta {\mathbf{z}}_k + {\mathbf{e}}_{q,k} .{} \end{aligned} $$
(344)
As an example, when polar motion and UT1 are observed the above equation takes the specific form
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} {x_P^{obs} (t_k )} \\ {y_P^{obs} (t_k )} \\ {\mathrm{UT1}^{obs}(t_k )} \end{array} \right] = \left[ \begin{array}{c} {x_P (t_k )} \\ {y_P (t_k )} \\ {\mathrm{UT1}(t_k )} \end{array} \right] + \left[ \begin{array}{ccc} 0 & { - 1} & 0 \\ { - 1} & 0 & 0 \\ 0 & 0 & {\frac{1}{2\pi B}} \end{array} \right]\left[ \begin{array}{c} {\theta_1 (t_k )} \\ {\theta_2 (t_k )} \\ {\theta_3 (t_k )} \end{array} \right] + \left[ \begin{array}{c} {e_{x_P } (t_k )} \\ {e_{y_P } (t_k )} \\ {e_{\mathrm{UT}1} (t_k )} \end{array} \right]. \end{aligned} $$
(345)
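As a minimal sketch (array layout and names are ours, assuming the EOP ordering (xP, yP, UT1) of Eq. (345)), the design matrix T0 and the reduction of Eq. (344) read in Python:

```python
import numpy as np

B = 1.00273781191135448

# design matrix T0 of Eq. (345): rotation angles (theta1, theta2, theta3)
# map to corrections of the EOP vector (x_P, y_P, UT1)
T0 = np.array([[0.0, -1.0, 0.0],
               [-1.0, 0.0, 0.0],
               [0.0, 0.0, 1.0 / (2.0 * np.pi * B)]])

def reduced_eop_obs(q_obs, z_ap):
    """Reduced pseudo-observations b_{q,k} = q_obs - T0 @ z_ap, Eq. (344)."""
    return q_obs - T0 @ z_ap
```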
The particular form of bq,k = qk + T0δzk + eq,k depends on which EOPs are provided by each technique. This model does not include rate time series of the form \(\dot {\mathbf {q}}^{obs}(t_k ) = \dot {\mathbf {q}}_k + {\mathbf {T}}_0 \delta \dot {\mathbf {z}}_k + \dot {\mathbf {e}}_{q,k} \), since these depend on the derivatives \(\delta \dot {\mathbf {z}}_k \) of the transformation parameters δzk, which are not present in the stacking model for coordinate time series and would thus constitute additional unknown parameters and an over-parameterization. In principle one can use a simple local interpolation depending on the 2j + 1 neighboring values δzk−j, …, δzk, …, δzk+j, to obtain a local function δz(t) in the form of a polynomial of degree 2j. Differentiation and evaluation at tk will produce \(\delta \dot {\mathbf {z}}_k = \delta \dot {\mathbf {z}}(t_k )\), which will depend on the neighboring values taken into account in the interpolation. For example, a simple 3-point Lagrangean interpolation of a second degree polynomial δz(t) = a + bt + ct2 based on the values δzk−1, δzk, δzk+1 will produce the relation
$$\displaystyle \begin{aligned} \delta\dot{\mathbf{z}}_k & = - \frac{(t_{k + 1} - t_k )}{(t_k - t_{k - 1} )(t_{k + 1} - t_{k - 1} )}\delta{\mathbf{z}}_{k - 1} + \frac{(t_{k + 1} - 2t_k + t_{k - 1} )}{(t_k - t_{k - 1} )(t_{k + 1} - t_k )}\delta{\mathbf{z}}_k +\\ &\quad + \frac{(t_k - t_{k - 1} )}{(t_{k + 1} - t_k )(t_{k + 1} - t_{k - 1} )}\delta{\mathbf{z}}_{k + 1} . \end{aligned} $$
(346)
Even simpler piecewise linear interpolation schemes are possible, based on only two successive values, such as \(\delta \dot {\mathbf {z}}_k = \frac {1}{t_{k + 1} - t_k }(\delta {\mathbf {z}}_{k + 1} - \delta {\mathbf {z}}_k )\) and \(\delta \dot {\mathbf {z}}_k = \frac {1}{t_k - t_{k - 1} }(\delta {\mathbf {z}}_k - \delta {\mathbf {z}}_{k - 1} )\), or their mean value
$$\displaystyle \begin{aligned} \delta \dot{\mathbf{z}}_k = - \frac{1}{2(t_k - t_{k - 1} )}\delta {\mathbf{z}}_{k - 1} + \frac{t_{k + 1} - 2t_k + t_{k - 1} }{2(t_k - t_{k - 1} )(t_{k + 1} - t_k )}\delta {\mathbf{z}}_k + \frac{1}{2(t_{k + 1} - t_k )}\delta {\mathbf{z}}_{k + 1} . \end{aligned} $$
(347)
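Purely as a numerical illustration of the two schemes (see the caveat that follows), a Python sketch with names of our choosing:

```python
import numpy as np

def dz_dot_lagrange(t, dz, k):
    """Derivative at t[k] of the 3-point Lagrange polynomial through
    (t[k-1], dz[k-1]), (t[k], dz[k]), (t[k+1], dz[k+1]), Eq. (346)."""
    tm, tk, tp = t[k - 1], t[k], t[k + 1]
    return (-(tp - tk) / ((tk - tm) * (tp - tm)) * dz[k - 1]
            + (tp - 2 * tk + tm) / ((tk - tm) * (tp - tk)) * dz[k]
            + (tk - tm) / ((tp - tk) * (tp - tm)) * dz[k + 1])

def dz_dot_mean(t, dz, k):
    """Mean of the forward and backward difference quotients, Eq. (347)."""
    fwd = (dz[k + 1] - dz[k]) / (t[k + 1] - t[k])
    bwd = (dz[k] - dz[k - 1]) / (t[k] - t[k - 1])
    return 0.5 * (fwd + bwd)
```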
We strongly advise against such a procedure, because the transformation parameters do not account only for a change of the spatiotemporal reference system but also serve to absorb variations due to observational errors. Differentiation strongly amplifies the effect of such errors, and the resulting time series of EOP derivatives will in general be too erratic to be of practical use. In order to combine EOP derivatives from the four space techniques in the final ITRF formulation, it is at least necessary to convert them from the reference system of each epoch to the spatiotemporal reference system of the stacking. This can be done by an a posteriori conversion based on the estimates \(\delta \hat {\mathbf {z}}_k \) obtained by the stacking of each separate technique, without including the EOP derivatives as pseudo-observations. This will also require an interpolation scheme, of doubtful effectiveness. In our opinion EOP derivatives should not be considered either in the stacking per technique or in the ITRF formulation, because they pose more problems than their prospective usefulness can justify.
For all epochs we have the EOP observation equations
$$\displaystyle \begin{aligned} {\mathbf{b}}_q = \mathbf{q} + \mathbf{T}\delta \mathbf{z} + {\mathbf{e}}_q . \end{aligned} $$
(348)
The EOP observations must be combined with the observation equations for the previously analyzed coordinate time series \(\mathbf {b} = {\mathbf {A}}_{\mathbf {x}} \delta {\mathbf {x}}_0 + {\mathbf {A}}_{\mathbf {v}} \delta {\mathbf {v}} + {\mathbf {A}}_{\mathbf {z}} \delta {\mathbf {z}} + {\mathbf {e}}_b = {\mathbf {A}}_{\mathbf {a}} \delta {\mathbf {a}} + {\mathbf {A}}_{\mathbf {z}} \delta {\mathbf {z}} + {\mathbf {e}}_b \) into the joint observation equations
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} \mathbf{b} \\ {{\mathbf{b}}_q } \end{array} \right] = \left[ \begin{array}{ccc} {{\mathbf{A}}_{\mathbf{a}} } & {{\mathbf{A}}_{\mathbf{z}} } & \mathbf{0} \\ \mathbf{0} & \mathbf{T} & \mathbf{I} \end{array} \right]\left[ \begin{array}{c} {\delta \mathbf{a}} \\ {\delta \mathbf{z}} \\ \mathbf{q} \end{array} \right] + \left[ \begin{array}{c} {{\mathbf{e}}_b } \\ {{\mathbf{e}}_q } \end{array} \right]. \end{aligned} $$
(349)
The only common parameters between the two sets b and bq are the transformation parameters δz. In the EOP time series observation model bq = q + Tδz + eq, the EOPs q are non-common parameters and their implied design submatrix is the non-singular identity matrix I. As already seen in Sect. 6.2, this means that the EOP pseudo-observations bq are non-adjustable observations with respect to the EOP parameters q. Thus the results of Proposition 6 or 7 apply, depending on whether b and bq are correlated or not. We will first examine the no-correlation situation, where the joint weight matrix has the form \(\left [ \begin {array}{cc} {{\mathbf {P}}_b } & \mathbf {0} \\ \mathbf {0} & {{\mathbf {P}}_q } \end {array} \right ]\). Then, according to Proposition 6, we can obtain the optimal estimates of \(\delta \hat {\mathbf {x}}_0 \), \(\delta \hat {\mathbf {v}}\), \(\delta \hat {\mathbf {z}}\) by adjusting only the coordinate time series observations b, performing the stacking without EOPs as described in the previous Sect. 13.3. The optimal estimates of the EOPs \(\hat {\mathbf {q}}\) can be directly computed from the error-free model \({\mathbf {b}}_q = \mathbf {q} + \mathbf {T}\delta \hat {\mathbf {z}}\), as
$$\displaystyle \begin{aligned} \hat{\mathbf{q}} = {\mathbf{b}}_q - \mathbf{T}\delta \hat{\mathbf{z}}. \end{aligned} $$
(350)
The situation is quite different, though, when the correlation between coordinate time series and EOP time series is taken into account and the weight matrix for the joint observation equations (349) has the form \(\left [ \begin {array}{cc} {{\mathbf {P}}_b } & {{\mathbf {P}}_{bq} } \\ {{\mathbf {P}}_{bq}^T } & {{\mathbf {P}}_q } \end {array} \right ]\). Then, according to Proposition 7, we can obtain the optimal estimates of \(\delta \hat {\mathbf {x}}_0 \), \(\delta \hat {\mathbf {v}}\), \(\delta \hat {\mathbf {z}}\) by adjusting only the coordinate time series observations b, performing the stacking without EOPs as described in the previous Sect. 13.3. Once \(\delta \hat {\mathbf {x}}_0 \), \(\delta \hat {\mathbf {v}}\), \(\delta \hat {\mathbf {z}}\) have been estimated from the coordinate time series, the EOP estimates may be obtained utilizing equation (302), which in this case (b1 →b, b2 →bq, x → δa, y → δz, z →q, A1x →Aa, A1y →Az, A2z →I, A2y →T) takes the form
$$\displaystyle \begin{aligned} \hat{\mathbf{q}} = {\mathbf{b}}_q - \mathbf{T}\delta \hat{\mathbf{z}} - {\mathbf{Q}}_{bq}^T {\mathbf{Q}}_b^{ - 1} (\mathbf{b} - {\mathbf{A}}_{\mathbf{a}} \delta \hat{\mathbf{a}} - {\mathbf{A}}_{\mathbf{z}} \delta \hat{\mathbf{z}}). \end{aligned} $$
(351)
Note that the last equation can be interpreted as a two-step procedure. In the first step the coordinate error estimates \(\hat {\mathbf {e}}_b = \mathbf {b} - {\mathbf {A}}_{\mathbf {a}} \delta \hat {\mathbf {a}} - {\mathbf {A}}_{\mathbf {z}} \delta \hat {\mathbf {z}} \) are used to predict the EOP errors \(\tilde {\mathbf {e}}_q = {\mathbf {Q}}_{bq}^T {\mathbf {Q}}_b^{ - 1} \hat {\mathbf {e}}_b \), and then from the model consistency \({\mathbf {b}}_q = \hat {\mathbf {q}} + \mathbf {T}\delta \hat {\mathbf {z}} + \tilde {\mathbf {e}}_q \) the EOP estimates follow as \(\hat {\mathbf {q}} = {\mathbf {b}}_q - \mathbf {T}\delta \hat {\mathbf {z}} - \tilde {\mathbf {e}}_q \). In addition to the separately derived covariance factor matrices \({\mathbf {Q}}_{\delta \hat {\mathbf {a}}} \), \({\mathbf {Q}}_{\delta \hat {\mathbf {z}}} \), \({\mathbf {Q}}_{\delta \hat {\mathbf {z}}\delta \hat {\mathbf {a}}} \), additional ones related to the EOP estimates can be computed sequentially utilizing equations (311), which in this case take the form
$$\displaystyle \begin{aligned} &{\mathbf{Q}}_{\delta \hat{\mathbf{z}},\hat{\mathbf{q}}} = {\mathbf{Q}}_{\delta \hat{\mathbf{z}}} ({\mathbf{A}}_{\mathbf{z}}^T {\mathbf{Q}}_b^{ - 1} {\mathbf{Q}}_{bq} - {\mathbf{T}}^T) + {\mathbf{Q}}_{\delta \hat{\mathbf{z}},\delta \hat{\mathbf{a}}} {\mathbf{A}}_{\mathbf{a}}^T {\mathbf{Q}}_b^{ - 1} {\mathbf{Q}}_{bq} , \\ &{\mathbf{Q}}_{\delta \hat{\mathbf{a}},\hat{\mathbf{q}}} = {\mathbf{Q}}_{\delta \hat{\mathbf{z}},\delta \hat{\mathbf{a}}}^T ({\mathbf{A}}_{\mathbf{z}}^T {\mathbf{Q}}_b^{ - 1} {\mathbf{Q}}_{bq} - {\mathbf{T}}^T) + {\mathbf{Q}}_{\delta \hat{\mathbf{a}}} {\mathbf{A}}_{\mathbf{a}}^T {\mathbf{Q}}_b^{ - 1} {\mathbf{Q}}_{bq} , \\ {} &{\mathbf{Q}}_{\hat{\mathbf{q}}} = {\mathbf{Q}}_q - {\mathbf{Q}}_{bq}^T {\mathbf{Q}}_b^{ - 1} {\mathbf{Q}}_{bq} + ({\mathbf{Q}}_{bq}^T {\mathbf{Q}}_b^{ - 1} {\mathbf{A}}_{\mathbf{z}} - \mathbf{T}){\mathbf{Q}}_{\delta \hat{\mathbf{z}},\hat{\mathbf{q}}} + {\mathbf{Q}}_{bq}^T {\mathbf{Q}}_b^{ - 1} {\mathbf{A}}_{\mathbf{a}} {\mathbf{Q}}_{\delta \hat{\mathbf{a}},\hat{\mathbf{q}}} . \end{aligned} $$
(352)
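A sketch of the two-step computation of Eq. (351) in Python (names are ours; Qb and Qbq are the covariance factor blocks of b and between b and bq):

```python
import numpy as np

def eop_two_step(bq, T, dz_hat, b, Aa, Az, da_hat, Qb, Qbq):
    """EOP estimate of Eq. (351) via its two-step interpretation:
    predict the EOP errors from the coordinate residuals, then subtract."""
    e_b = b - Aa @ da_hat - Az @ dz_hat        # coordinate error estimates
    e_q = Qbq.T @ np.linalg.solve(Qb, e_b)     # predicted EOP errors
    return bq - T @ dz_hat - e_q
```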
The question whether the correlation between EOPs and coordinate time series within data of the same epoch should be taken into account must be answered in the positive, if the assumptions on which the data analysis is based (correct deterministic and stochastic model, zero mean errors, correct error covariances, no systematic errors) are consistent with reality. If not, one might prefer not to let problems in the determination of EOPs within each epoch have an undesired influence on the quality of the estimates of initial coordinates and velocities. Computational experience has shown very small differences between the results obtained from the above two possible approaches.

16 ITRF Formulation: Combination of Initial Coordinates, Velocities and EOPs Obtained Separately from Each Space Technique

In order to combine different sets of data, these must be connected, either in the deterministic part of their model through the existence of common unknown parameters, or in the stochastic part through the existence of correlations, or in both ways. Since data from different space techniques are naturally uncorrelated, the existence of common parameters is the only possible way. There are two types of common parameters. The first are the coordinate-related parameters of stations at the collocation sites, which appear as common parameters between the data from one technique and the data from local surveys performed for ties between nearby stations observing with different space techniques. The other are the EOP parameters, which must be common to all techniques that provide them, when they are expressed in the final ITRF reference system.

Of the two sets, the EOPs are by far the more problematic. Common EOP parameters in the strict sense are (1) daily polar motion parameters, their rates and LOD from VLBI and GPS and (2) weekly polar motion from SLR and DORIS. One of the problems is how to connect daily with weekly EOP estimates of the same type. They are certainly interrelated, but not exactly the same, so they cannot be directly included as common parameters in the ITRF formulation adjustment. They are both a (not clearly defined) type of average of the relevant instantaneous values at the observation instants of the data analyzed to produce the daily or weekly estimates. Such averages depend on both the type of the observations and their temporal distribution within the data analysis interval.

A more serious problem is the compatibility of EOPs from different techniques, with respect to the pole to which they refer. Nominally, they all refer to the CIP, but it is highly questionable how this reference is realized, given that the CIP is a conventionally constructed concept and not a physical object. Only the three parameters of the rotation matrix connecting the TRS to the CRS can be considered as physical objects, but these appear only in VLBI observations. SLR, GPS and DORIS sense earth rotation through the satellite orbits, which, when viewed in the TRS, formally depend on the instantaneous rotation vector that appears in the pseudo-forces caused by the TRS rotation. Even the rotation vector, as any type of velocity, is not a physical object but a mathematical concept, which becomes physically accessible only in the case of time-continuous observations and not discrete ones, as in the case of the geodetic space techniques. All these EOP-related problems make the role of the ties at collocation sites in the joint analysis of all data in the ITRF formulation even more important. In addition, they pose the serious problem of how to treat EOP parameters in the ITRF formulation in a way that their ambiguities have no negative influence on the quality of initial station coordinates and velocities, which are the main ITRF products. In any case we will also present here the “rigorous” combination of coordinate time series from all techniques, including EOP time series, for the sake of completeness, although we by no means endorse this approach.

The local ties at collocation sites are performed either by the use of classical terrestrial measurements (direction angles, distances, and spirit leveling) or by the GPS technique. When the GPS technique is used, the resulting station-to-station vectors are already aligned to a GPS-related reference system. In this case we do not need to express the observed displacement vector as a function of its ITRF components and the rotation angles from the ITRF to the GPS reference system. Indeed, a rotational ambiguity that would cause a 10 m displacement on the earth surface (an exaggerated value!) would cause a displacement of only one tenth of a millimeter over a 50 m baseline. Thus observed baselines may be modeled as already referring to the final ITRF reference system. The situation is quite different when only terrestrial observations are performed in local ties and the resulting vector refers to the local astronomic frame. To maintain an accuracy of one tenth of a millimeter over a 50 m baseline in the conversion to a global reference system, one needs knowledge of the direction of the vertical down to 0.1 mas, which is quite impossible to achieve. Modeling the observed vectors as functions of both the coordinate differences in the ITRF reference system and the rotation angles from the local horizontal to the global ITRF system has the disadvantage that each vector adds as many unknowns (rotation angles) as observations (vector components), except for sites where more than two space techniques are collocated. For this reason, it is necessary that local terrestrial techniques include connections with a surrounding geodetic network, existing or created for this purpose, which is already aligned to a global network through GPS observations.

We may look upon the ITRF formulation as a simultaneous stacking problem, in which the input data are the coordinate time series of each technique, plus the local tie data, plus the EOP time series of each technique. The basic difference from the single-technique stacking and EOP estimation is that now the EOP data refer to unknowns common to all techniques and not to each technique separately. We will see how the estimates from the single-technique adjustments can be used in the ITRF formulation without actually performing the solution with all available data in one step. The latter will be formulated only theoretically, in order to connect its results with the estimates from the separate solutions. Since we are dealing with uncorrelated data between different techniques, we can profit from the property of the addition of normal equations.

Recall that we have introduced the notation BD(MT), BR(MT) and BC(MT) for a block-diagonal, a row, and a column matrix, respectively, having as elements the submatrices MT, for T = V, S, G, D, corresponding to the four space techniques (VLBI, SLR, GPS, DORIS).

The EOP pseudo-observation equations (344) at epoch tk, from a particular space technique T, take the form bq,T,k = qT,k + TT,kδzT,k + eq,T,k. If qk are the EOPs at epoch tk from all techniques, we may connect qT,k from technique T to the total EOPs qk, through qT,k = LT,kqk, where LT,k is a participation matrix. It may be obtained by removing from the identity matrix the rows corresponding to EOPs that are not observed by technique T. Usually LT,k will be the same for all epochs, but different ones may account for missing EOP observations in some epochs. For all epochs the EOPs qT, observed by technique T, are respectively connected to the overall EOPs q, from all techniques, through
$$\displaystyle \begin{aligned} {\mathbf{q}}_T = \left[ \begin{array}{c} \vdots \\ {{\mathbf{q}}_{T,k} } \\ \vdots \end{array} \right] = \left[ \begin{array}{c} \vdots \\ {{\mathbf{L}}_{T,k} {\mathbf{q}}_k } \\ \vdots \end{array} \right] = \left[ \begin{array}{ccc} \ddots & \vdots & \mathbf{0} \\ \cdots & {{\mathbf{L}}_{T,k} } & \cdots \\ \mathbf{0} & \vdots & \ddots \end{array} \right]\left[ \begin{array}{c} \vdots \\ {{\mathbf{q}}_k } \\ \vdots \end{array} \right] = {\mathbf{L}}_T \mathbf{q}. \end{aligned} $$
(353)
LT is a sparse matrix with non-zero quasi-diagonal sub-blocks LT,k.
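A sketch of how such a participation matrix may be built in Python (names are ours):

```python
import numpy as np

def participation_matrix(observed_rows, n_eop):
    """L_{T,k}: the identity matrix with the rows of the EOPs not
    observed by technique T removed, so that q_{T,k} = L_{T,k} @ q_k."""
    return np.eye(n_eop)[observed_rows, :]

# e.g. a technique providing only polar motion out of (x_P, y_P, UT1):
L_pm = participation_matrix([0, 1], 3)    # a 2 x 3 selection matrix
```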
The coordinate time series data from each technique T are
$$\displaystyle \begin{aligned} {\mathbf{b}}_T = {\mathbf{A}}_{{\mathbf{a}}_T } \delta {\mathbf{a}}_T + {\mathbf{A}}_{{\mathbf{z}}_T } \delta {\mathbf{z}}_T + {\mathbf{e}}_T ,\qquad \quad {\mathbf{e}}_T \sim (\mathbf{0},\sigma^2{\mathbf{P}}_T^{ - 1} ),\qquad T = V,S,G,D, \end{aligned} $$
(354)
The EOP time series data from each technique T are
$$\displaystyle \begin{aligned} {\mathbf{b}}_{q,T} = {\mathbf{q}}_T + {\mathbf{T}}_T \delta {\mathbf{z}}_T + {\mathbf{e}}_{q,T} ,\quad {\mathbf{e}}_{q,T} \sim (\mathbf{0},\sigma^2{\mathbf{P}}_{q,T}^{ - 1} ),\qquad \quad T = V,S,G,D, \end{aligned} $$
(355)
where qT is the subset of all EOPs q provided by the technique T.
The data from local ties at the collocation sites have the form
$$\displaystyle \begin{aligned} &{\mathbf{b}}_c = {\mathbf{C}}_{{\mathbf{a}}_V } \delta {\mathbf{a}}_V + {\mathbf{C}}_{{\mathbf{a}}_S } \delta {\mathbf{a}}_S + {\mathbf{C}}_{{\mathbf{a}}_G } \delta {\mathbf{a}}_G + {\mathbf{C}}_{{\mathbf{a}}_D } \delta {\mathbf{a}}_D + {\mathbf{v}}_c = \\ &= [{\begin{array}{cccc} {{\mathbf{C}}_{{\mathbf{a}}_V } } & {{\mathbf{C}}_{{\mathbf{a}}_S } } & {{\mathbf{C}}_{{\mathbf{a}}_G } } & {{\mathbf{C}}_{{\mathbf{a}}_D } } \end{array} }]\left[ \begin{array}{c} {\delta {\mathbf{a}}_V } \\ {\delta {\mathbf{a}}_S } \\ {\delta {\mathbf{a}}_G } \\ {\delta {\mathbf{a}}_D } \end{array} \right] + {\mathbf{v}}_c \equiv \mathbf{C}\delta \mathbf{a} + {\mathbf{v}}_c ,\qquad {\mathbf{v}}_c \sim (\mathbf{0},\sigma^2{\mathbf{P}}_c^{ - 1} ), \end{aligned} $$
(356)
with δa ≡BC(δaT), since there is no significant dependence on the transformation parameters zT.
The indispensable contribution of the tie observations to the normal equations, which must always be included, is given by Ncδa = uc, where
$$\displaystyle \begin{aligned} &{\mathbf{N}}_c = \left[ \begin{array}{cccc} {{\mathbf{C}}_{{\mathbf{a}}_V }^T {\mathbf{P}}_c {\mathbf{C}}_{{\mathbf{a}}_V } } & {{\mathbf{C}}_{{\mathbf{a}}_V }^T {\mathbf{P}}_c {\mathbf{C}}_{{\mathbf{a}}_S } } & {{\mathbf{C}}_{{\mathbf{a}}_V }^T {\mathbf{P}}_c {\mathbf{C}}_{{\mathbf{a}}_G } } & {{\mathbf{C}}_{{\mathbf{a}}_V }^T {\mathbf{P}}_c {\mathbf{C}}_{{\mathbf{a}}_D } } \\ {{\mathbf{C}}_{{\mathbf{a}}_S }^T {\mathbf{P}}_c {\mathbf{C}}_{{\mathbf{a}}_V } } & {{\mathbf{C}}_{{\mathbf{a}}_S }^T {\mathbf{P}}_c {\mathbf{C}}_{{\mathbf{a}}_S } } & {{\mathbf{C}}_{{\mathbf{a}}_S }^T {\mathbf{P}}_c {\mathbf{C}}_{{\mathbf{a}}_G } } & {{\mathbf{C}}_{{\mathbf{a}}_S }^T {\mathbf{P}}_c {\mathbf{C}}_{{\mathbf{a}}_D } } \\ {{\mathbf{C}}_{{\mathbf{a}}_G }^T {\mathbf{P}}_c {\mathbf{C}}_{{\mathbf{a}}_V } } & {{\mathbf{C}}_{{\mathbf{a}}_G }^T {\mathbf{P}}_c {\mathbf{C}}_{{\mathbf{a}}_S } } & {{\mathbf{C}}_{{\mathbf{a}}_G }^T {\mathbf{P}}_c {\mathbf{C}}_{{\mathbf{a}}_G } } & {{\mathbf{C}}_{{\mathbf{a}}_G }^T {\mathbf{P}}_c {\mathbf{C}}_{{\mathbf{a}}_D } } \\ {{\mathbf{C}}_{{\mathbf{a}}_D }^T {\mathbf{P}}_c {\mathbf{C}}_{{\mathbf{a}}_V } } & {{\mathbf{C}}_{{\mathbf{a}}_D }^T {\mathbf{P}}_c {\mathbf{C}}_{{\mathbf{a}}_S } } & {{\mathbf{C}}_{{\mathbf{a}}_D }^T {\mathbf{P}}_c {\mathbf{C}}_{{\mathbf{a}}_G } } & {{\mathbf{C}}_{{\mathbf{a}}_D }^T {\mathbf{P}}_c {\mathbf{C}}_{{\mathbf{a}}_D } } \end{array} \right], \end{aligned} $$
(357)
$$\displaystyle \begin{aligned} &{\mathbf{u}}_c = \mathrm{BC}({\mathbf{u}}_{c,T} ) = \mathrm{BC}({\mathbf{C}}_{{\mathbf{a}}_T }^T {\mathbf{P}}_c {\mathbf{b}}_c ). \end{aligned} $$
(358)

16.1 Combination Without Taking EOP Data into Consideration

Let us look first into the approach where EOP time series are not taken into account and the only data are the coordinate time series and the tie observations at collocation sites. We are mainly interested in station initial coordinates and velocities, while the transformation parameters are nuisance parameters. We can proceed in two ways, applying either Proposition 3 or Proposition 5. In the first case we can add the reduced normal equations from each technique, after the nuisance transformation parameters have been eliminated, as well as the normal equations from the tie observations, in order to produce the joint normal equations as they would result after the elimination of the nuisance parameters. The original normal equations from each technique are
$$\displaystyle \begin{aligned} \left[ \begin{array}{cc} {{\mathbf{N}}_{{\mathbf{a}}_T } } & {{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T } } \\ {{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T }^T } & {{\mathbf{N}}_{{\mathbf{z}}_T } } \end{array} \right]\left[ \begin{array}{c} {\delta \hat{\mathbf{a}}_T } \\ {\delta \hat{\mathbf{z}}_T } \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{u}}_{{\mathbf{a}}_T } } \\ {{\mathbf{u}}_{{\mathbf{z}}_T } } \end{array} \right]. \end{aligned} $$
(359)
The transformation parameters \(\delta \hat {\mathbf {z}}_T \) can be eliminated by solving the second of the normal equations for
$$\displaystyle \begin{aligned} \delta \hat{\mathbf{z}}_T= {\mathbf{N}}_{{\mathbf{z}}_T }^{ - 1} {\mathbf{u}}_{{\mathbf{z}}_T } - {\mathbf{N}}_{{\mathbf{z}}_T }^{ - 1} {\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T }^T \delta \hat{\mathbf{a}}_T , \end{aligned} $$
(360)
and replacing in the first one, thus arriving at the reduced normal equations
$$\displaystyle \begin{aligned} \bar{\mathbf{N}}_{{\mathbf{a}}_T } \delta \hat{\mathbf{a}}_T = \bar{\mathbf{u}}_{{\mathbf{a}}_T } ,\qquad \qquad T = V,S,G,D, \end{aligned} $$
(361)
where \(\bar {\mathbf {N}}_{{\mathbf {a}}_T } = {\mathbf {N}}_{{\mathbf {a}}_T } - {\mathbf {N}}_{{\mathbf {a}}_T {\mathbf {z}}_T } {\mathbf {N}}_{{\mathbf {z}}_T }^{ - 1} {\mathbf {N}}_{{\mathbf {a}}_T {\mathbf {z}}_T }^T \) and \(\bar {\mathbf {u}}_{{\mathbf {a}}_T } = {\mathbf {u}}_{{\mathbf {a}}_T } - {\mathbf {N}}_{{\mathbf {a}}_T {\mathbf {z}}_T } {\mathbf {N}}_{{\mathbf {z}}_T }^{ - 1} {\mathbf {u}}_{{\mathbf {z}}_T } \). If these are added to the normal equations for the tie observations, we obtain the reduced joint normal equations
$$\displaystyle \begin{aligned} ({\mathbf{N}}_c + \bar{\mathbf{N}}_{\mathbf{a}} )\delta \hat{\mathbf{a}} = {\mathbf{u}}_c + \bar{\mathbf{u}}_{\mathbf{a}} , \end{aligned} $$
(362)
where \(\bar {\mathbf {N}}_{\mathbf {a}} = \mathrm {BD}(\bar {\mathbf {N}}_{{\mathbf {a}}_T } )\), \(\delta \hat {\mathbf {a}} = \mathrm {BC}(\delta \hat {\mathbf {a}}_T )\), \(\bar {\mathbf {u}}_{\mathbf {a}} = \mathrm {BC}(\bar {\mathbf {u}}_{{\mathbf {a}}_T } )\). These can be solved with the addition of minimal constraints as usual (see chapter 5). The second possible approach, which produces exactly the same results, is based on the application of Proposition 5. The separate reduced normal equations of each technique (361) are solved with the use of minimal constraints, which may be different for each technique, producing estimates \(\delta \hat {\mathbf {a}}_{V\vert V} \), \(\delta \hat {\mathbf {a}}_{S\vert S} \), \(\delta \hat {\mathbf {a}}_{G\vert G} \), \(\delta \hat {\mathbf {a}}_{D\vert D} \). The notation \(\delta \hat {\mathbf {a}}_{T\vert T} \) emphasizes the fact that these are estimates of δaT based on data from only the space technique T, thus reserving the notation \(\delta \hat {\mathbf {a}}_T \) for the estimates obtained when data from all techniques are combined. These are used as pseudo-observations with weight matrices \(\bar{\mathbf {N}}_{{\mathbf {a}}_V } \), \(\bar{\mathbf {N}}_{{\mathbf {a}}_S } \), \(\bar{\mathbf {N}}_{{\mathbf {a}}_G } \), \(\bar{\mathbf {N}}_{{\mathbf {a}}_D } \), respectively, together with the tie observations, to produce the reduced joint normal equations
$$\displaystyle \begin{aligned} (\bar{\mathbf{N}}_{\mathbf{a}} + {\mathbf{N}}_c )\delta \hat{\mathbf{a}} = {\mathbf{u}}_c + \bar{\mathbf{N}}_{\mathbf{a}} \mathrm{BC}(\delta \hat{\mathbf{a}}_{T\vert T} ) = {\mathbf{u}}_c + \mathrm{BC}(\bar{\mathbf{N}}_{{\mathbf{a}}_T } \delta \hat{\mathbf{a}}_{T\vert T} ), \end{aligned} $$
(363)
which in view of the relations \(\bar {\mathbf {u}}_{{\mathbf {a}}_T } = \bar {\mathbf {N}}_{{\mathbf {a}}_T } \delta \hat {\mathbf {a}}_{T|T} \) are the same as in the previous approach.
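Both routes rest on the same elementary operations, sketched below in Python under the notation above (a sketch only; names are ours, and the singular joint normal matrix still requires minimal constraints before inversion):

```python
import numpy as np

def reduce_normals(Na, Naz, Nz, ua, uz):
    """Eliminate the transformation parameters by the Schur complement,
    Eqs. (360)-(361): returns (Na_bar, ua_bar)."""
    Na_bar = Na - Naz @ np.linalg.solve(Nz, Naz.T)
    ua_bar = ua - Naz @ np.linalg.solve(Nz, uz)
    return Na_bar, ua_bar

def recover_dz(Nz, Naz, uz, da_hat):
    """Back-substitution for the eliminated parameters, Eq. (364)."""
    return np.linalg.solve(Nz, uz - Naz.T @ da_hat)
```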
Once the optimal values of the parameters \(\delta \hat {\mathbf {a}}_T \) have been determined by solving the above normal equations using minimal constraints \({\mathbf {C}}_{\mathbf {a}}^T \delta \mathbf {a} = {\mathbf {d}}_{\mathbf {a}} \), the estimates of the transformation parameters follow from
$$\displaystyle \begin{aligned} \delta \hat{\mathbf{z}}_T = {\mathbf{N}}_{{\mathbf{z}}_T }^{ - 1} ({\mathbf{u}}_{{\mathbf{z}}_T } - {\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T }^T \delta \hat{\mathbf{a}}_T ), \end{aligned} $$
(364)
with related covariance factor matrices (see Eqs. 311)
$$\displaystyle \begin{aligned} &{\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{z}}} = - {\mathbf{Q}}_{\hat{\mathbf{a}}} {\mathbf{N}}_{\mathbf{az}} {\mathbf{N}}_{\mathbf{z}}^{ - 1} , \\ &{\mathbf{Q}}_{\hat{\mathbf{z}}} = {\mathbf{N}}_{\mathbf{z}}^{ - 1} - {\mathbf{N}}_{\mathbf{z}}^{ - 1} {\mathbf{N}}_{\mathbf{az}}^T {\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{z}}} .{} \end{aligned} $$
(365)
The EOP data from the four techniques can be combined a posteriori, thus preventing their questionable nature from influencing the estimates of the initial coordinates and velocities included in \(\delta \hat {\mathbf {a}}\). The relevant model is
$$\displaystyle \begin{aligned} \tilde{\mathbf{b}}_{q,T} \equiv {\mathbf{b}}_{q,T} - {\mathbf{T}}_T \delta \hat{\mathbf{z}}_T = {\mathbf{q}}_T + {\mathbf{e}}_{q,T} ,\quad {\mathbf{e}}_{q,T} \sim (\mathbf{0},\sigma^2{\mathbf{P}}_{q,T}^{ - 1} ),\qquad \quad T = V,S,G,D, \end{aligned} $$
(366)
where \(\tilde {\mathbf {b}}_{q,T} \) are the EOP data transformed into the new reference system, common to all techniques, utilizing the values \(\delta \hat {\mathbf {z}}_T \) estimated in the first step. The correlation between bq,T and \(\delta \hat {\mathbf {z}}_T \), stemming from the one between bq,T and the coordinate time series data bT, is ignored. The same is true for the covariance matrix of \(\delta \hat {\mathbf {z}}_T \), which is treated as a matrix of constants in the EOP combination. It should, however, be taken into account in the covariance propagation for the computation of the covariance factor matrix \({\mathbf {Q}}_{\hat{\mathbf {q}}}\) of the EOP estimates \(\hat {\mathbf {q}}\) resulting from the combination. The situation is quite similar to the so-called “stations with weights” in classical densification networks, where the officially adopted coordinates of stations from a higher-order network were necessarily fixed in the adjustment of the new network, but their uncertainty was taken into account when computing the covariance matrix of the new station coordinates. Even today this approach may be useful for relating a local network to the reference system of the ITRF network (i.e., the ITRS), without altering the coordinates and velocities of the included ITRF stations.
Since not all EOPs q are included in each technique, using the above introduced participation matrix LT, we obtain the observation equations
$$\displaystyle \begin{aligned} \tilde{\mathbf{b}}_{q,T} = {\mathbf{L}}_T \mathbf{q} + {\mathbf{e}}_{q,T} ,\quad {\mathbf{e}}_{q,T} \sim (\mathbf{0},\sigma^2{\mathbf{P}}_{q,T}^{ - 1} ),\qquad \quad T = V,S,G,D, \end{aligned} $$
(367)
The joint observation equations from all techniques are
$$\displaystyle \begin{aligned} \tilde{\mathbf{b}}_q = \mathbf{Lq} + {\mathbf{e}}_q ,\qquad \qquad {\mathbf{e}}_q \sim (\mathbf{0},\sigma^2{\mathbf{P}}_q^{ - 1} ), \end{aligned} $$
(368)
with \(\tilde {\mathbf {b}}_q = \mathrm {BC}(\tilde {\mathbf {b}}_{q,T} )\), L = BC(LT), eq = BC(eq,T) and weight matrix Pq = BD(Pq,T). The corresponding normal equations are
$$\displaystyle \begin{aligned} {\mathbf{N}}_{\mathbf{q}} \hat{\mathbf{q}} = {\mathbf{u}}_{\mathbf{q}} , \end{aligned} $$
(369)
with
$$\displaystyle \begin{aligned} {\mathbf{N}}_{\mathbf{q}} \,{=}\, {\mathbf{L}}^T{\mathbf{P}}_q \mathbf{L} \,{=}\, \sum_{T = V,S,G,D} {{\mathbf{L}}_T^T } {\mathbf{P}}_{q,T} {\mathbf{L}}_T ,\quad {\mathbf{u}}_{\mathbf{q}} \,{=}\, {\mathbf{L}}^T{\mathbf{P}}_q \tilde{\mathbf{b}}_q \,{=}\, \sum_{T = V,S,G,D} {{\mathbf{L}}_T^T } {\mathbf{P}}_{q,T} \tilde{\mathbf{b}}_{q,T}. \end{aligned} $$
(370)
The solution \(\hat {\mathbf {q}} = {\mathbf {N}}_{\mathbf {q}}^{ - 1} {\mathbf {u}}_{\mathbf {q}} \) provides the desired final EOP estimates. The matrix Nq is non-singular, as can be easily seen by considering the case where all techniques observe all EOPs, in which case LT = I; more generally, it remains non-singular as long as every EOP is observed by at least one technique. The corresponding covariance factor matrix, taking into account the uncertainty \({\mathbf {Q}}_{\delta \hat {\mathbf {z}}} \) of the fixed transformation parameters \(\delta \hat {\mathbf {z}}\), is given by
$$\displaystyle \begin{aligned} {\mathbf{Q}}_{\hat{\mathbf{q}}} = {\mathbf{N}}_{\mathbf{q}}^{ - 1} {\mathbf{L}}^T{\mathbf{P}}_q (\mathbf{I} + \mathbf{TQ}_{\delta \hat{\mathbf{z}}} {\mathbf{T}}^T{\mathbf{P}}_q )\mathbf{LN}_{\mathbf{q}}^{ - 1} , \end{aligned} $$
(371)
where T = BD(TT).
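In Python, the a posteriori EOP combination of Eqs. (369)-(370) may be sketched as follows (lists over the four techniques; names are ours):

```python
import numpy as np

def combine_eops(b_tilde, L, P):
    """Solve N_q q_hat = u_q with N_q = sum L_T' P_T L_T and
    u_q = sum L_T' P_T b_tilde_T, Eqs. (369)-(370)."""
    Nq = sum(LT.T @ PT @ LT for LT, PT in zip(L, P))
    uq = sum(LT.T @ PT @ bT for LT, PT, bT in zip(L, P, b_tilde))
    return np.linalg.solve(Nq, uq)
```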
If desired, a more rigorous approach can be followed, taking into account the correlation between bq,T and \(\delta \hat {\mathbf {z}}_T \), which follows from the correlation between bq,T and bT. We will not pursue this approach, since we have already departed from strict rigor for the reasons already explained. One of the advantages of this two-step approach is that in the second step of the EOP combination one can model weekly estimates of EOP values and their rates as functions of the daily EOP parameters within the week. This calls for an interpolation scheme for the daily values and the expression of the weekly value as an average of the interpolated function. For example, a direct linear regression \(q_i = q(t_i ) = q_k^W + (t_i - t_k )\dot {q}_k^W \) of the daily values qk−3, …, qk, …, qk+3 within the week allows one to express the weekly values \(q_k^W \) and their rates \(\dot {q}_k^W \), visualized as assigned to the mid-week epoch tk, as the functions
$$\displaystyle \begin{aligned} q_k^W &= \frac{1}{7}q_{k - 3} + \frac{1}{7}q_{k - 2} + \frac{1}{7}q_{k - 1} + \frac{1}{7}q_k + \frac{1}{7}q_{k + 1} + \frac{1}{7}q_{k + 2} + \frac{1}{7}q_{k + 3} , \end{aligned} $$
(372)
$$\displaystyle \begin{aligned} \dot{q}_k^W &= - \frac{3}{28h}q_{k - 3} - \frac{2}{28h}q_{k - 2} - \frac{1}{28h}q_{k - 1} + 0q_k + \frac{1}{28h}q_{k + 1}+\\ &\quad + \frac{2}{28h}q_{k + 2} + \frac{3}{28h}q_{k + 3} , \end{aligned} $$
(373)
where h is the length of one day in the adopted units. Note that in this case the matrices LT are no longer “inflation” matrices with elements one or zero, but proper design matrices, as derived from interpolation models such as the one above (see the sketch below).
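A sketch of Eqs. (372)-(373) in Python, for 7 daily values centered at the mid-week epoch (names are ours):

```python
import numpy as np

def weekly_from_daily(q, h=1.0):
    """Weekly value and rate at the mid-week epoch from the 7 daily
    values q[k-3..k+3], via the linear regression of Eqs. (372)-(373)."""
    q = np.asarray(q, dtype=float)          # length-7 array
    offsets = np.arange(-3, 4)              # day offsets -3..3
    q_week = q.mean()                        # Eq. (372)
    q_week_dot = offsets @ q / (28.0 * h)    # Eq. (373); sum of i**2 = 28
    return q_week, q_week_dot
```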

16.2 Combination Including EOP Data

When EOP time series are also included, we can still use reduced normal equations in which the transformation parameters have been eliminated. The EOP parameters are no longer non-adjustable parameters, as in the case of the stacking of data from a single technique, because they appear as unknowns in the data of all techniques. From the correlated observation equations (354) and (355), or in matrix form
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} {{\mathbf{b}}_T } \\ {{\mathbf{b}}_{q,T} } \end{array} \right] = \left[ \begin{array}{ccc} {{\mathbf{A}}_{{\mathbf{a}}_T } } & \mathbf{0} & {{\mathbf{A}}_{{\mathbf{z}}_T } } \\ \mathbf{0} & \mathbf{I} & {\mathbf{T}}_T \end{array} \right]\left[ \begin{array}{c} {\delta {\mathbf{a}}_T } \\ {{\mathbf{q}}_T } \\ {\delta {\mathbf{z}}_T } \end{array} \right] + \left[ \begin{array}{c} {{\mathbf{e}}_T } \\ {{\mathbf{e}}_{q,T} } \end{array} \right], \end{aligned} $$
(374)
and their corresponding weight matrix \(\left [ \begin {array}{cc} {{\mathbf {P}}_{b_T } } & {{\mathbf {P}}_{b_T q_T } } \\ {{\mathbf {P}}_{b_T q_T }^T } & {{\mathbf {P}}_{q_T } } \end {array} \right ]\), it follows that the normal equations for the correlated coordinate and EOP time series data of each technique T = V, S, G, D, have the form
$$\displaystyle \begin{aligned} \left[ \begin{array}{ccc} {{\mathbf{N}}_{\delta {\mathbf{a}}_T } } & {{\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T } } & {{\mathbf{N}}_{\delta {\mathbf{a}}_T \delta {\mathbf{z}}_T } } \\ {{\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T }^T } & {{\mathbf{N}}_{{\mathbf{q}}_T } } & {{\mathbf{N}}_{{\mathbf{q}}_T \delta {\mathbf{z}}_T } } \\ {{\mathbf{N}}_{\delta {\mathbf{a}}_T \delta {\mathbf{z}}_T }^T } & {{\mathbf{N}}_{{\mathbf{q}}_T \delta {\mathbf{z}}_T }^T } & {{\mathbf{N}}_{\delta {\mathbf{z}}_T } } \end{array} \right]\left[ \begin{array}{c} {\delta \hat{\mathbf{a}}_{T\vert T} } \\ {\hat{\mathbf{q}}_{T\vert T} } \\ {\delta \hat{\mathbf{z}}_{T\vert T} } \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{u}}_{\delta {\mathbf{a}}_T } } \\ {{\mathbf{u}}_{{\mathbf{q}}_T } } \\ {{\mathbf{u}}_{\delta {\mathbf{z}}_T } } \end{array} \right]. \end{aligned} $$
(375)
where
$$\displaystyle \begin{aligned} &{\mathbf{N}}_{\delta {\mathbf{a}}_T } = {\mathbf{A}}_{{\mathbf{a}}_T }^T {\mathbf{P}}_{b_T } {\mathbf{A}}_{{\mathbf{a}}_T } ,\qquad \quad {\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T } = {\mathbf{A}}_{{\mathbf{a}}_T }^T {\mathbf{P}}_{b_T q_T } , \\ &{\mathbf{N}}_{\delta {\mathbf{a}}_T \delta {\mathbf{z}}_T } = {\mathbf{A}}_{{\mathbf{a}}_T }^T ({\mathbf{P}}_{b_T } {\mathbf{A}}_{{\mathbf{z}}_T } + {\mathbf{P}}_{b_T q_T } {\mathbf{T}}_T), \\ &{\mathbf{N}}_{{\mathbf{q}}_T } = {\mathbf{P}}_{q_T } ,\qquad \qquad \qquad \ \, \, {\mathbf{N}}_{{\mathbf{q}}_T \delta {\mathbf{z}}_T } = {\mathbf{P}}_{b_T q_T }^T {\mathbf{A}}_{{\mathbf{z}}_T } + {\mathbf{P}}_{q_T } {\mathbf{T}}_T, \\ {} &{\mathbf{N}}_{\delta {\mathbf{z}}_T } = ({\mathbf{A}}_{{\mathbf{z}}_T }^T {\mathbf{P}}_{b_T } + {\mathbf{T}}^T_T{\mathbf{P}}_{b_T q_T }^T ){\mathbf{A}}_{{\mathbf{z}}_T } + ({\mathbf{A}}_{{\mathbf{z}}_T }^T {\mathbf{P}}_{b_T q_T } + {\mathbf{T}}^T_T{\mathbf{P}}_{q_T } ){\mathbf{T}}_T, \end{aligned} $$
(376)
$$\displaystyle \begin{aligned} &{\mathbf{u}}_{\delta {\mathbf{a}}_T } = {\mathbf{A}}_{{\mathbf{a}}_T }^T {\mathbf{P}}_{b_T } {\mathbf{b}}_T + {\mathbf{A}}_{{\mathbf{a}}_T }^T {\mathbf{P}}_{b_T q_T } {\mathbf{b}}_{q,T} , \\ &{\mathbf{u}}_{{\mathbf{q}}_T } = {\mathbf{P}}_{b_T q_T }^T {\mathbf{b}}_T + {\mathbf{P}}_{q_T } {\mathbf{b}}_{q,T} , \\ {} &{\mathbf{u}}_{\delta {\mathbf{z}}_T } = ({\mathbf{A}}_{{\mathbf{z}}_T }^T {\mathbf{P}}_{b_T } + {\mathbf{T}}^T_T{\mathbf{P}}_{b_T q_T }^T ){\mathbf{b}}_T + ({\mathbf{A}}_{{\mathbf{z}}_T }^T {\mathbf{P}}_{b_T q_T } + {\mathbf{T}}^T_T{\mathbf{P}}_{q_{T}} ){\mathbf{b}}_{q,T} . \end{aligned} $$
(377)
Here \(\delta \hat {\mathbf {a}}_{T\vert T} \), \(\delta \hat {\mathbf {z}}_{T\vert T} \), \(\hat {\mathbf {q}}_{T\vert T} \) denote the estimates of δaT, δzT, qT, respectively, obtained using only the data from technique T, in order to distinguish them from the respective estimates \(\delta \hat {\mathbf {a}}_T \), \(\delta \hat {\mathbf {z}}_T \), \(\hat {\mathbf {q}}\), obtained using data from all techniques. Notice however that, while δaT, δzT are completely different parameters for each technique T, the EOP parameters qT are a subset of the EOP parameters q covered by all techniques. Thus \(\hat {\mathbf {q}}_{T\vert T} \) are separate estimates of qT, while \(\hat {\mathbf {q}}\) is the joint estimate of q.
In order to apply the addition of the normal equations we must take into account that qT = LTq, in which case the normal equations (375) are “inflated” into
$$\displaystyle \begin{aligned} \left[ \begin{array}{ccc} {{\mathbf{N}}_{\delta {\mathbf{a}}_T } } & {{\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T } {\mathbf{L}}_T } & {{\mathbf{N}}_{\delta {\mathbf{a}}_T \delta {\mathbf{z}}_T } } \\ {{\mathbf{L}}_T^T {\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T }^T } & {{\mathbf{L}}_T^T {\mathbf{N}}_{{\mathbf{q}}_T } {\mathbf{L}}_T } & {{\mathbf{L}}_T^T {\mathbf{N}}_{{\mathbf{q}}_T \delta {\mathbf{z}}_T } } \\ {{\mathbf{N}}_{\delta {\mathbf{a}}_T \delta {\mathbf{z}}_T }^T } & {{\mathbf{N}}_{{\mathbf{q}}_T \delta {\mathbf{z}}_T }^T {\mathbf{L}}_T } & {{\mathbf{N}}_{\delta {\mathbf{z}}_T } } \end{array} \right]\left[ \begin{array}{c} {\delta \hat{\mathbf{a}}_{T\vert T} } \\ {\hat{\mathbf{q}}_{\vert T} } \\ {\delta \hat{\mathbf{z}}_{T\vert T} } \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{u}}_{\delta {\mathbf{a}}_T } } \\ {{\mathbf{L}}_T^T {\mathbf{u}}_{{\mathbf{q}}_T } } \\ {{\mathbf{u}}_{\delta {\mathbf{z}}_T } } \end{array} \right]. \end{aligned} $$
(378)
This simply means that \({\mathbf {N}}_{\delta {\mathbf {a}}_T {\mathbf {q}}_T } {\mathbf {L}}_T \) is formed from \({\mathbf {N}}_{\delta {\mathbf {a}}_T {\mathbf {q}}_T } \) by inserting zero columns at the slots of the missing EOPs, \({\mathbf {L}}_T^T {\mathbf {N}}_{{\mathbf {q}}_T \delta {\mathbf {z}}_T } \) is formed from \({\mathbf {N}}_{{\mathbf {q}}_T \delta {\mathbf {z}}_T } \) by inserting zero rows at the slots of the missing EOPs, while \({\mathbf {L}}_T^T {\mathbf {N}}_{{\mathbf {q}}_T } {\mathbf {L}}_T \) is formed from \({\mathbf {N}}_{{\mathbf {q}}_T } \) by inserting both zero columns and zero rows at the same slots. The inflation thus amounts to simple matrix products with the participation matrix, as the sketch below shows.
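A minimal Python sketch of these products, under the notation above (names are ours):

```python
import numpy as np

def inflate_blocks(N_aq, N_q, N_qz, L):
    """Inflate the per-technique blocks of Eq. (375) into those of
    Eq. (378); multiplying by the participation matrix L inserts zero
    columns/rows at the slots of the EOPs not observed by the technique."""
    return N_aq @ L, L.T @ N_q @ L, L.T @ N_qz
```

The addition of the inflated normal equations (together with the contribution of the tie observations) then results in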
$$\displaystyle \begin{aligned} \left[ \begin{array}{ccc} {{\mathbf{N}}_{\delta \mathbf{a} } + {\mathbf{N}}_c } & {{\mathbf{N}}_{\delta \mathbf{a},\mathbf{q}} } & {{\mathbf{N}}_{\delta \mathbf{a},\delta \mathbf{z}} } \\ {{\mathbf{N}}_{\delta \mathbf{a},\mathbf{q}}^T } & {{\mathbf{N}}_{\mathbf{q}} } & {{\mathbf{N}}_{\mathbf{q},\delta \mathbf{z}} } \\ {{\mathbf{N}}_{\delta \mathbf{a},\delta \mathbf{z}}^T } & {{\mathbf{N}}_{\mathbf{q},\delta \mathbf{z}}^T } & {{\mathbf{N}}_{\delta \mathbf{z}} } \end{array} \right]\left[ \begin{array}{c} {\delta \hat{\mathbf{a}}} \\ \hat{\mathbf{q}} \\ {\delta \hat{\mathbf{z}}} \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{u}}_{\delta \mathbf{a} } + {\mathbf{u}}_c } \\ {{\mathbf{u}}_{\mathbf{q}} } \\ {{\mathbf{u}}_{\delta \mathbf{z}} } \end{array} \right], \end{aligned} $$
(379)
where
$$\displaystyle \begin{aligned} &\delta \hat{\mathbf{a}} = \mathrm{BC}(\delta \hat{\mathbf{a}}_T ),\qquad \delta \hat{\mathbf{z}} = \mathrm{BC}(\delta \hat{\mathbf{z}}_T ), \\ &{\mathbf{u}}_{\delta \mathbf{a}} = \mathrm{BC}({\mathbf{u}}_{\delta {\mathbf{a}}_T } ),\quad {\mathbf{u}}_{\delta \mathbf{z}} = \mathrm{BC}({\mathbf{u}}_{\delta {\mathbf{z}}_T } ), \\ &{\mathbf{N}}_{\delta \mathbf{a}} = \mathrm{BD}({\mathbf{N}}_{\delta {\mathbf{a}}_T } ),\quad {\mathbf{N}}_{\delta \mathbf{a},\delta \mathbf{z}} = \mathrm{BD}({\mathbf{N}}_{\delta {\mathbf{a}}_T \delta {\mathbf{z}}_T } ),\qquad {\mathbf{N}}_{\delta \mathbf{z}} = \mathrm{BD}({\mathbf{N}}_{\delta {\mathbf{z}}_T } ) \\ &{\mathbf{N}}_{\delta \mathbf{a},\mathbf{q}} = \mathrm{BC}({\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T } {\mathbf{L}}_{T}),\qquad \qquad {\mathbf{N}}_{\mathbf{q},\delta \mathbf{z}} = \mathrm{BR}({\mathbf{L}}_T^{T} {\mathbf{N}}_{{\mathbf{q}}_T \delta {\mathbf{z}}_T } ), \\ {} &{\mathbf{N}}_{\mathbf{q}} = \sum_{T = V,S,G,D} {{\mathbf{L}}_T^T } {\mathbf{N}}_{{\mathbf{q}}_T } {\mathbf{L}}_T ,\qquad \quad {\mathbf{u}}_{\mathbf{q}} = \sum_{T = V,S,G,D} {{\mathbf{L}}_T^T } {\mathbf{u}}_{{\mathbf{q}}_T } . \end{aligned} $$
(380)
Since the transformation parameters are nuisance parameters, it is convenient to eliminate them from the normal equations, following either Proposition 3 or Proposition 5.
If Proposition 3 is followed, we must eliminate δzT from the normal equations (375) before inflation and then add the resulting reduced normal equations in inflated form, plus the ones from the tie observations. The reduced equations become
$$\displaystyle \begin{aligned} \left[ \begin{array}{cc} {\bar{\mathbf{N}}_{\delta {\mathbf{a}}_T } } & {\bar{\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T } } \\ {\bar{\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T }^T } & {\bar{\mathbf{N}}_{{\mathbf{q}}_T } } \end{array} \right]\left[ \begin{array}{c} {\delta \hat{\mathbf{a}}_{T\vert T} } \\ {\hat{\mathbf{q}}_{T\vert T} } \end{array} \right] = \left[ \begin{array}{c} {\bar{\mathbf{u}}_{\delta {\mathbf{a}}_T } } \\ {\bar{\mathbf{u}}_{{\mathbf{q}}_T } } \end{array} \right], \end{aligned} $$
(381)
where
$$\displaystyle \begin{aligned} &\bar{\mathbf{N}}_{\delta {\mathbf{a}}_T } = {\mathbf{N}}_{\delta {\mathbf{a}}_T } - {\mathbf{N}}_{\delta {\mathbf{a}}_T \delta {\mathbf{z}}_T } {\mathbf{N}}_{\delta {\mathbf{z}}_T }^{ - 1} {\mathbf{N}}_{\delta {\mathbf{a}}_T \delta {\mathbf{z}}_T }^T ,\quad \bar{\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T } {=} {\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T } - {\mathbf{N}}_{\delta {\mathbf{a}}_T \delta {\mathbf{z}}_T } {\mathbf{N}}_{\delta {\mathbf{z}}_T }^{ - 1} {\mathbf{N}}_{{\mathbf{q}}_T \delta {\mathbf{z}}_T }^T , \\ &\bar{\mathbf{N}}_{{\mathbf{q}}_T } = {\mathbf{N}}_{{\mathbf{q}}_T } - {\mathbf{N}}_{{\mathbf{q}}_T \delta {\mathbf{z}}_T } {\mathbf{N}}_{\delta {\mathbf{z}}_T }^{ - 1} {\mathbf{N}}_{{\mathbf{q}}_T \delta {\mathbf{z}}_T }^T , \\ {} &\bar{\mathbf{u}}_{\delta {\mathbf{a}}_T } = {\mathbf{u}}_{\delta {\mathbf{a}}_T } - {\mathbf{N}}_{\delta {\mathbf{a}}_T \delta {\mathbf{z}}_T } {\mathbf{N}}_{\delta {\mathbf{z}}_T }^{ - 1} {\mathbf{u}}_{\delta {\mathbf{z}}_T } ,\qquad \bar{\mathbf{u}}_{{\mathbf{q}}_T } = {\mathbf{u}}_{{\mathbf{q}}_T } - {\mathbf{N}}_{{\mathbf{q}}_T \delta {\mathbf{z}}_T } {\mathbf{N}}_{\delta {\mathbf{z}}_T }^{ - 1} {\mathbf{u}}_{\delta {\mathbf{z}}_T } . \end{aligned} $$
(382)
The reduced equations (381) must first be inflated into
$$\displaystyle \begin{aligned} \left[ \begin{array}{cc} {\bar{\mathbf{N}}_{\delta {\mathbf{a}}_T } } & {\bar{\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T } {\mathbf{L}}_T } \\ {{\mathbf{L}}_T^T \bar{\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T }^T } & {{\mathbf{L}}_T^T \bar{\mathbf{N}}_{{\mathbf{q}}_T } {\mathbf{L}}_T } \end{array} \right]\left[ \begin{array}{c} {\delta \hat{\mathbf{a}}_{T\vert T} } \\ {\hat{\mathbf{q}}_{\vert T} } \end{array} \right] = \left[ \begin{array}{c} {\bar{\mathbf{u}}_{\delta {\mathbf{a}}_T } } \\ {{\mathbf{L}}_T^T \bar{\mathbf{u}}_{{\mathbf{q}}_T } } \end{array} \right], \end{aligned} $$
(383)
and then added together with the ones from tie observations, to produce the joint reduced normal equations
$$\displaystyle \begin{aligned} \left[ \begin{array}{cc} {\bar{\mathbf{N}}_{\delta \mathbf{a}} + {\mathbf{N}}_c } & {\bar{\mathbf{N}}_{\delta \mathbf{a},\mathbf{q}} } \\ {\bar{\mathbf{N}}_{\delta \mathbf{a},\mathbf{q}}^T } & {\bar{\mathbf{N}}_{\mathbf{q}} } \end{array} \right]\left[ \begin{array}{c} {\delta \hat{\mathbf{a}}} \\ \hat{\mathbf{q}} \end{array} \right] = \left[ \begin{array}{c} {\bar{\mathbf{u}}_{\delta \mathbf{a}} + {\mathbf{u}}_c } \\ {\bar{\mathbf{u}}_{\mathbf{q}} } \end{array} \right], \end{aligned} $$
(384)
where
$$\displaystyle \begin{aligned} &\bar{\mathbf{N}}_{\delta \mathbf{a},\mathbf{q}} = \mathrm{BC}(\bar{\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T } {\mathbf{L}}_T ), \;\;\quad \qquad \bar{\mathbf{N}}_{\delta \mathbf{a}} = \mathrm{BD}(\bar{\mathbf{N}}_{\delta {\mathbf{a}}_T } ),\\ {} &\bar{\mathbf{N}}_{\mathbf{q}} = \sum_{T = V,S,G,D} {{\mathbf{L}}_T^T \bar{\mathbf{N}}_{{\mathbf{q}}_T } } {\mathbf{L}}_T , \;\qquad \bar{\mathbf{u}}_{\mathbf{q}} = \sum_{T = V,S,G,D} {{\mathbf{L}}_T^T \bar{\mathbf{u}}_{{\mathbf{q}}_T } } . \end{aligned} $$
(385)
If desired the estimates of the transformation parameters can be derived according to Eq. (277), which in this case becomes
$$\displaystyle \begin{aligned} \delta \hat{\mathbf{z}}_T = {\mathbf{N}}_{\delta {\mathbf{z}}_T }^{ - 1} ({\mathbf{u}}_{\delta {\mathbf{z}}_T } - {\mathbf{N}}_{\delta {\mathbf{a}}_T \delta {\mathbf{z}}_T }^T \delta \hat{\mathbf{a}}_T - {\mathbf{N}}_{{\mathbf{q}}_T \delta {\mathbf{z}}_T }^T {\mathbf{L}}_T \hat{\mathbf{q}}). \end{aligned} $$
(386)
The related covariance matrices can be sequentially computed (see Eqs. 312) as
$$\displaystyle \begin{aligned} &{\mathbf{Q}}_{\delta \hat{\mathbf{a}},\delta \hat{\mathbf{z}}} = - ({\mathbf{Q}}_{\delta \hat{\mathbf{a}}} {\mathbf{N}}_{\delta \mathbf{a},\delta \mathbf{z}} + {\mathbf{Q}}_{\delta \hat{\mathbf{a}},\hat{\mathbf{q}}} {\mathbf{N}}_{\mathbf{q},\delta \mathbf{z}} ){\mathbf{N}}_{\delta \mathbf{z}}^{ - 1} , \\ &{\mathbf{Q}}_{\hat{\mathbf{q}},\delta \hat{\mathbf{z}}} = - ({\mathbf{Q}}_{\delta \hat{\mathbf{a}},\hat{\mathbf{q}}}^T {\mathbf{N}}_{\delta \mathbf{a},\delta \mathbf{z}} + {\mathbf{Q}}_{\hat{\mathbf{q}}} {\mathbf{N}}_{\mathbf{q},\delta \mathbf{z}} ){\mathbf{N}}_{\delta \mathbf{z}}^{ - 1} , \\ {} &{\mathbf{Q}}_{\delta \hat{\mathbf{z}}} = {\mathbf{N}}_{\delta \mathbf{z}}^{ - 1} - {\mathbf{N}}_{\delta \mathbf{z}}^{ - 1} ({\mathbf{N}}_{\delta \mathbf{a},\delta \mathbf{z}}^T {\mathbf{Q}}_{\delta \hat{\mathbf{a}},\delta \hat{\mathbf{z}}} + {\mathbf{N}}_{\mathbf{q},\delta \mathbf{z}}^{T} {\mathbf{Q}}_{\hat{\mathbf{q}},\delta \hat{\mathbf{z}}} ). \end{aligned} $$
(387)
Much simpler is the approach suggested by Proposition 5. The separate estimates \(\delta \hat {\mathbf {a}}_{T\vert T} \), \(\hat {\mathbf {q}}_{T\vert T} \) from the reduced normal equations of each technique (after eliminating the transformation parameters δzT) can be used as pseudo-observations, with the reduced normal equation coefficient matrices as weights. The normal equations of each technique have already been given by (381) and (382). The pseudo-observation equations from each technique T are therefore
$$\displaystyle \begin{aligned} &\delta \hat{\mathbf{a}}_{T\vert T} = \delta {\mathbf{a}}_T + {\mathbf{e}}_{\delta \hat{\mathbf{a}}_{T\vert T} } , \end{aligned} $$
(388)
$$\displaystyle \begin{aligned} &\hat{\mathbf{q}}_{T\vert T} = {\mathbf{q}}_T + {\mathbf{e}}_{\hat{\mathbf{q}}_{T\vert T} } = {\mathbf{L}}_T \mathbf{q} + {\mathbf{e}}_{\hat{\mathbf{q}}_{T\vert T} } , \end{aligned} $$
(389)
with weight matrix \(\left [ \begin {array}{cc} {\bar {\mathbf {N}}_{\delta {\mathbf {a}}_T } } & {\bar {\mathbf {N}}_{\delta {\mathbf {a}}_T {\mathbf {q}}_T } } \\ {\bar {\mathbf {N}}_{\delta {\mathbf {a}}_T {\mathbf {q}}_T }^T } & {\bar {\mathbf {N}}_{{\mathbf {q}}_T } } \end {array} \right ]\). Accordingly, their contribution to the joint normal equations is
$$\displaystyle \begin{aligned} \left[ \begin{array}{cc} {\bar{\mathbf{N}}_{\delta {\mathbf{a}}_T } } & {\bar{\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T } {\mathbf{L}}_T } \\ {{\mathbf{L}}_T^T \bar{\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T }^T } & {{\mathbf{L}}_T^T \bar{\mathbf{N}}_{{\mathbf{q}}_T } {\mathbf{L}}_T } \end{array} \right]\left[ \begin{array}{c} {\delta \hat{\mathbf{a}}_T } \\ \hat{\mathbf{q}} \end{array} \right] = \left[ \begin{array}{c} {\bar{\mathbf{N}}_{\delta {\mathbf{a}}_T } \delta \hat{\mathbf{a}}_{T\vert T} + \bar{\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T } \hat{\mathbf{q}}_{T\vert T} } \\ {{\mathbf{L}}_T^T \bar{\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T }^T \delta \hat{\mathbf{a}}_{T\vert T} + {\mathbf{L}}_T^T \bar{\mathbf{N}}_{{\mathbf{q}}_T } \hat{\mathbf{q}}_{T\vert T} } \end{array} \right]. \end{aligned} $$
(390)
Adding the above contributions from all techniques and the contribution from the tie observations, the reduced joint normal equations become
$$\displaystyle \begin{aligned} \left[ \begin{array}{cc} {\bar{\mathbf{N}}_{\delta \mathbf{a}} } & {\bar{\mathbf{N}}_{\delta \mathbf{a},\mathbf{q}} } \\ {\bar{\mathbf{N}}_{\delta \mathbf{a},\mathbf{q}}^T } & {\bar{\mathbf{N}}_{\mathbf{q}} } \end{array} \right]\left[ \begin{array}{c} {\delta \hat{\mathbf{a}}} \\ \hat{\mathbf{q}} \end{array} \right] = \left[ \begin{array}{c} {\tilde{\mathbf{u}}_{\delta \mathbf{a}} } \\ {\tilde{\mathbf{u}}_{\mathbf{q}} } \end{array} \right], \end{aligned} $$
(391)
where
$$\displaystyle \begin{aligned} &\delta \hat{\mathbf{a}} = \mathrm{BC}(\delta \hat{\mathbf{a}}_T ),\qquad \bar{\mathbf{N}}_{\delta \mathbf{a}} = \mathrm{BD}(\bar{\mathbf{N}}_{\delta {\mathbf{a}}_T } ),\qquad \qquad \qquad \bar{\mathbf{N}}_{\delta \mathbf{a},\mathbf{q}} = \mathrm{BC}(\bar{\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T } {\mathbf{L}}_T ), \\ &\bar{\mathbf{N}}_{\mathbf{q}} = \sum_T {{\mathbf{L}}_T^T } \bar{\mathbf{N}}_{{\mathbf{q}}_T } {\mathbf{L}}_T , \\ {} &\tilde{\mathbf{u}}_{\delta \mathbf{a}} \,{=}\, \mathrm{BC}(\bar{\mathbf{N}}_{\delta {\mathbf{a}}_T } \delta \hat{\mathbf{a}}_{T\vert T} \,{+}\, \bar{\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T } \hat{\mathbf{q}}_{T\vert T}) ,\quad \tilde{\mathbf{u}}_{\mathbf{q}} {=} \sum_T {({\mathbf{L}}_T^T \bar{\mathbf{N}}_{\delta {\mathbf{a}}_T {\mathbf{q}}_T }^T \delta \hat{\mathbf{a}}_{T\vert T} \,{+}\, {\mathbf{L}}_T^T \bar{\mathbf{N}}_{{\mathbf{q}}_T } \hat{\mathbf{q}}_{T\vert T} ).} \end{aligned} $$
(392)
The advantage of this approach is that we may obtain the per-technique estimates \(\delta \hat {\mathbf {a}}_{T\vert T} \), \(\hat {\mathbf {q}}_{T\vert T} \) in any convenient way, and not necessarily by solving (through minimal constraints) the reduced normal equations (381). Thus we may take advantage of the fact that, for a single technique, the EOPs are non-adjustable parameters, and proceed as described in Sects. 13.3 and 15, utilizing a reduced weight matrix.

17 ITRF Formulation: The Combination of Separate Estimates from Space Techniques in the Case of Non-singular Covariance Matrices

We have seen in the previous chapter that estimates from the separate stackings of each space technique can be used as direct pseudo-observations of the corresponding unknowns, ignoring completely the fact that the estimates refer to a different reference system for each technique, which is also different from the reference system to which the ITRF parameters will finally refer. This seemingly arbitrary assumption is legitimate, provided that one uses as weight matrices the coefficient matrices of the normal equations of each technique and (most important) that these matrices have been rigorously computed on the basis of the data at hand, and thus share the rank deficiencies due to the lack of reference system information in the performed observations. When this is not the case, and the normal equation matrices have no rank defects at all, or have rank defects different from those assumed on a theoretical basis, one must include in the data analysis model transformation parameters connecting the spatiotemporal reference system of each space technique to that of the ITRF. The same holds true if, for some reason, positive-definite weight matrices are used other than the coefficient matrices of the per-technique normal equations.

In the second combination step the input data are the estimates obtained from the stacking of each technique T = V, S, G, D, namely the initial coordinates and velocities \(\hat {\mathbf {a}}_{T\vert T} = \left [ \begin {array}{c} {\delta \hat {\mathbf {x}}_{T\vert T} } \\ {\delta \hat {\mathbf {v}}_{T\vert T} } \end {array} \right ]\) and the stacked EOPs \(\hat {\mathbf {q}}_{T\vert T} \), containing the time series \(\hat {\mathbf {q}}_{k\vert T} \) for all epochs tk, k = 1, 2, …, m. Each of these refers to a proper reference system established in the first step through the use of minimal constraints. In order to relate them to the corresponding combined estimates \(\hat {\mathbf {a}}_T \), \(\hat {\mathbf {q}}_k \), referring to a final common ITRF reference system, we need the transformation laws from one reference system to the other. In principle the transformation to the technique T coordinates xi|T(t) from those of the ITRF xi(t) has the general form
$$\displaystyle \begin{aligned} {\mathbf{x}}_{i\vert T} (t) & = [1 + s_T (t)]\mathbf{R}(\boldsymbol{\uptheta }_T (t)){\mathbf{x}}_i (t) + {\mathbf{d}}_T (t)\approx\\ &\approx {\mathbf{x}}_i (t) + s_T (t){\mathbf{x}}_i^{ap} (t) + [{\mathbf{x}}_i^{ap} (t)\times ]\boldsymbol{\uptheta }_T (t) + {\mathbf{d}}_T (t), \end{aligned} $$
(393)
with \({\mathbf {x}}_i^{ap} (t) = {\mathbf {x}}_{0i}^{ap} + (t - t_0 ){\mathbf {v}}_i^{ap} \), assuming common approximate values, where one may in principle incorporate arbitrary continuous transformation functions θT(t), dT(t), sT(t). However, the need to maintain the linear-in-time model, at least approximately, necessitates the restriction to linear functions \(\boldsymbol{\uptheta }_T (t) = \boldsymbol{\uptheta }_{0T} + (t - t_0 )\dot{\boldsymbol{\uptheta }}_T \), \({\mathbf {d}}_T (t) = {\mathbf {d}}_{0T} + (t - t_0 )\dot {\mathbf {d}}_T \), \(s_T (t) = s_{0T} + (t - t_0 )\dot {s}_T \). In this case the (t − t0)2 terms are negligibly small and the transformation law for initial coordinates and velocities takes the familiar form,
$$\displaystyle \begin{aligned} &\hat{\mathbf{x}}_{0i,T\vert T} = {\mathbf{x}}_{0i} + s_{0T} {\mathbf{x}}_{0i}^{ap} + [{\mathbf{x}}_{0i}^{ap} \times ]\boldsymbol{\uptheta }_{0T} + {\mathbf{d}}_{0T} + {\mathbf{e}}_{{\mathbf{x}}_{0i,T} } = {\mathbf{x}}_{0i} + {\mathbf{E}}_i {\mathbf{p}}_{0T} + {\mathbf{e}}_{{\mathbf{x}}_{0i,T} } , \\ {} &\hat{\mathbf{v}}_{i,T\vert T} = {\mathbf{v}}_i + \dot{s}_T {\mathbf{x}}_{0i}^{ap} + [{\mathbf{x}}_{0i}^{ap} \times ]\boldsymbol{\dot{\theta }}_T + \dot{\mathbf{d}}_T + {\mathbf{e}}_{{\mathbf{v}}_{iT} } = {\mathbf{v}}_i + {\mathbf{E}}_i \dot{\mathbf{p}}_T + {\mathbf{e}}_{{\mathbf{v}}_{iT} } , \end{aligned} $$
(394)
or collectively, \(\hat {\mathbf {x}}_{0T\vert T} = {\mathbf {x}}_{0T} + \mathbf {Ep}_{0T} +{\mathbf {e}}_{{\mathbf {x}}_{0T}}\), \(\hat {\mathbf {v}}_{T\vert T} = {\mathbf {v}}_T + \mathbf {E}\dot {\mathbf {p}}_T +{\mathbf {e}}_{{\mathbf {v}}_{T}}\), and jointly
$$\displaystyle \begin{aligned} \hat{\mathbf{a}}_{T\vert T} & = \left[ \begin{array}{c} {\hat{\mathbf{x}}_{0T\vert T} } \\ {\hat{\mathbf{v}}_{T\vert T} } \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{x}}_{0T} + \mathbf{Ep}_{0T} + {\mathbf{e}}_{{\mathbf{x}}_{0T} } } \\ {{\mathbf{v}}_T + \mathbf{E}\dot{\mathbf{p}}_T + {\mathbf{e}}_{{\mathbf{v}}_{T} }} \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{x}}_{0T} } \\ {{\mathbf{v}}_T } \end{array} \right] + \left[ \begin{array}{cc} \mathbf{E} & \mathbf{0} \\ \mathbf{0} & \mathbf{E} \end{array} \right]\left[ \begin{array}{c} {{\mathbf{p}}_{0T} } \\ {\dot{\mathbf{p}}_T } \end{array} \right]+\\ &\quad + \left[ \begin{array}{c} {{\mathbf{e}}_{{\mathbf{x}}_{0T} } } \\ {{\mathbf{e}}_{{\mathbf{v}}_{T} } } \end{array} \right] \equiv {\mathbf{a}}_{T} + {\mathbf{E}}_{\mathbf{a}} {\mathbf{p}}_{T} + {\mathbf{e}}_{{\mathbf{a}}_T } . \end{aligned} $$
(395)
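For concreteness, the following Python sketch builds the per station matrix \({\mathbf{E}}_i = [\,[{\mathbf{x}}_{0i}^{ap}\times ]\ \ {\mathbf{I}}_3\ \ {\mathbf{x}}_{0i}^{ap}\,]\) of Eq. (394), with parameter order p = (θ, d, s), and stacks it into the joint matrix E_a of Eq. (395). It is illustrative only; the two station positions are arbitrary placeholders, not real ITRF coordinates.

```python
import numpy as np

def cross_matrix(x):
    # [x×] such that cross_matrix(x) @ v == np.cross(x, v)
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]])

def station_E(x_ap):
    # E_i = [ [x_ap×]  I_3  x_ap ], a 3 x 7 block for p = (theta, d, s)
    return np.hstack([cross_matrix(x_ap), np.eye(3), x_ap.reshape(3, 1)])

# placeholder approximate coordinates for n = 2 stations
x_ap = np.array([[4075.5, 931.8, 4801.6],
                 [-2831.0, 4676.0, 3275.2]])
E = np.vstack([station_E(xi) for xi in x_ap])          # network-level E (3n x 7)
# Eq. (395): joint matrix acting on initial coordinates and velocities
E_a = np.block([[E, np.zeros_like(E)], [np.zeros_like(E), E]])
print(E_a.shape)                                       # (12, 14) for n = 2
```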
The transformation parameters zk,T from the reference system of technique T to the one of epoch tk must be related to the corresponding values zk from the final system of the ITRF to that of epoch tk.
As we have already seen (see Table 1), the transformation of EOP time series under a change of the reference system has the form \({{\mathbf {q}}^{\prime }_{k}} = {\mathbf {q}}_{k} + {\mathbf {T}}_0 \mathbf {p}(t_k )\), which in the present case, where \({\mathbf {p}}_T (t_{k}) = {\mathbf {p}}_{0T} + (t_k - t_0 )\dot {\mathbf {p}}_T \), takes the form \({{\mathbf {q}}^{\prime }_{k}} = {\mathbf {q}}_{k} +\)\({\mathbf {T}}_0 {\mathbf {p}}_{0T} + (t_k - t_0 ){\mathbf {T}}_0 \dot {\mathbf {p}}_T \). Here \({{\mathbf {q}}^{\prime }_{k}} \) are the EOPs of the epoch tk, expressed in the reference system of technique T, and qk the same quantities expressed in the final ITRF reference system. If rates (derivatives) of EOPs are also present in the separate stacking solution of each technique, they will obviously transform according to \({\dot {\mathbf {q}}^{\prime }}_k = \dot {\mathbf {q}}_k + {\mathbf {T}}_0 \dot {\mathbf {p}}(t_k ) = \dot {\mathbf {q}}_k + {\mathbf {T}}_0 \dot {\mathbf {p}}_T \). Collectively, the estimates \(\hat {\mathbf {q}}_{T\vert T} \) of the stacking of each technique, for all epochs, can be expressed as functions of the same subset of the EOPs qT through
$$\displaystyle \begin{aligned} \hat{\mathbf{q}}_{T\vert T} & = \left[ \begin{array}{c} \vdots \\ {\hat{\mathbf{q}}_{k\vert T} } \\ \vdots \\ - \\ \vdots \\ {\hat{\dot{\mathbf{q}}}_{k\vert T} } \\ \vdots \end{array} \right] = \left[ \begin{array}{c} \vdots \\ {{\mathbf{q}}_k } \\ \vdots \\ - \\ \vdots \\ {\dot{\mathbf{q}}_k } \\ \vdots \end{array} \right] + \left[ \begin{array}{cc} \vdots & \vdots \\ {{\mathbf{T}}_0 } & {(t_k - t_0 ){\mathbf{T}}_0 } \\ \vdots & \vdots \\ - & - \\ \vdots & \vdots \\ \mathbf{0} & {{\mathbf{T}}_0 } \\ \vdots & \vdots \end{array} \right] \left[ \begin{array}{c} {{\mathbf{p}}_{0T} } \\ {\dot{\mathbf{p}}_T } \end{array} \right] + \left[ \begin{array}{c} \vdots \\ {{\mathbf{e}}_{{\mathbf{q}}_{k,T} } } \\ \vdots \\ - \\ \vdots \\ {{\mathbf{e}}_{\dot{\mathbf{q}}_{k,T} } } \\ \vdots \end{array} \right]=\\ & = {\mathbf{q}}_T + {\mathbf{E}}_{\mathbf{q}} {\mathbf{p}}_T + {\mathbf{e}}_{{\mathbf{q}}_T } ,\quad T = V,S,G,D. \end{aligned} $$
(396)
Here qT is the subset of all EOP data that are contained in technique T, expressed in the ITRF reference system, while \(\hat {\mathbf {q}}_{T\vert T} \) are the estimates of qT provided by the stacking of technique T only, expressed in the corresponding reference system. As we have seen, the relation to the complete set of EOP time series q, obtained from one or more techniques, is expressed by qT = LTq, where LT is the already introduced participation matrix.
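As a small illustration, the participation matrix LT can be realized as a 0/1 selection matrix; the index set below is purely hypothetical.

```python
import numpy as np

def participation_matrix(idx_T, n_q):
    # rows pick out of the full EOP vector q the entries observed by technique T
    L = np.zeros((len(idx_T), n_q))
    L[np.arange(len(idx_T)), idx_T] = 1.0
    return L

q = np.arange(10.0)          # stand-in for the full EOP vector
idx_T = [0, 1, 4, 5, 8]      # entries contributed by technique T (assumed)
L_T = participation_matrix(idx_T, q.size)
assert np.allclose(L_T @ q, q[idx_T])   # q_T = L_T q
```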
The transformation parameters zk,T transform the reference system of a specific technique T to the one at epoch tk, while the transformation parameters pT(t) = \({\mathbf {p}}_{0,T} + (t - t_0 )\dot {\mathbf {p}}_T \) transform the reference system of the ITRF to that of a specific technique T. Therefore the estimates \(\hat{\mathbf{z}}_{k,T\vert T}\) can be expressed in terms of the parameters zk, which transform the reference system of the ITRF to that of epoch tk, through zk,T|T = zk + pT(tk) = \({\mathbf {z}}_k + {\mathbf {p}}_{0,T} + (t_k - t_0 )\dot {\mathbf {p}}_T \), leading to the pseudo-observation equations
$$\displaystyle \begin{aligned} \hat{\mathbf{z}}_{k,T\vert T} = {\mathbf{z}}_k + {\mathbf{p}}_T (t_k ) + {\mathbf{e}}_{\hat{\mathbf{z}}_{k,T\vert T}} = {\mathbf{z}}_k + {\mathbf{p}}_{0,T} + (t_k - t_0 )\dot{\mathbf{p}}_T + {\mathbf{e}}_{\hat{\mathbf{z}}_{k,T\vert T}}, \end{aligned} $$
(397)
or jointly
$$\displaystyle \begin{aligned} \hat{\mathbf{z}}_{T\vert T} = \left[ \begin{array}{c} \vdots \\ {\hat{\mathbf{z}}_{k,T\vert T} } \\ \vdots \end{array} \right] = \left[ \begin{array}{c} \vdots \\ {{\mathbf{z}}_k } \\ \vdots \end{array} \right] + \left[ \begin{array}{cc} \vdots & \vdots \\ \mathbf{I} & {(t_k - t_0 )\mathbf{I}} \\ \vdots & \vdots \end{array} \right]\left[ \begin{array}{c} {{\mathbf{p}}_{0,T} } \\ {\dot{\mathbf{p}}_T } \end{array} \right] + {\mathbf{e}}_{{\mathbf{z}}_T} \equiv {\mathbf{z}}_T + {\mathbf{E}}_{\mathbf{z}} {\mathbf{p}}_T + {\mathbf{e}}_{{\mathbf{z}}_T } . \end{aligned} $$
(398)
The total pseudo-observation equations are
$$\displaystyle \begin{aligned} {\mathbf{b}}_T &= \hat{\mathbf{x}}_{T\vert T} = \left[ \begin{array}{c} {\hat{\mathbf{a}}_{T\vert T} } \\ {\hat{\mathbf{q}}_{T\vert T} } \\ {\hat{\mathbf{z}}_{T\vert T} } \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{a}}_T + {\mathbf{E}}_{\mathbf{a}} {\mathbf{p}}_T + {\mathbf{e}}_{{\mathbf{a}}_T } } \\ {{\mathbf{L}}_T \mathbf{q} + {\mathbf{E}}_{\mathbf{q}} {\mathbf{p}}_T + {\mathbf{e}}_{{\mathbf{q}}_T } } \\ {{\mathbf{z}}_T + {\mathbf{E}}_{\mathbf{z}} {\mathbf{p}}_T + {\mathbf{e}}_{{\mathbf{z}}_T } } \end{array} \right] =\\ &=\left[ \begin{array}{ccc} \mathbf{I} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & {{\mathbf{L}}_T } & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{I} \end{array} \right]\left[ \begin{array}{c} {\mathbf{a}}_{T} \\ \mathbf{q} \\ {{\mathbf{z}}_T } \end{array} \right] + \left[ \begin{array}{c} {{\mathbf{E}}_{\mathbf{a}} } \\ {{\mathbf{E}}_{\mathbf{q}} } \\ {{\mathbf{E}}_{\mathbf{z}} } \end{array} \right]{\mathbf{p}}_T + \left[ \begin{array}{c} {{\mathbf{e}}_{{\mathbf{a}}_T } } \\ {{\mathbf{e}}_{{\mathbf{q}}_T } } \\ {{\mathbf{e}}_{{\mathbf{z}}_T } } \end{array} \right] =\\ &= {\mathbf{A}}_T {\mathbf{x}}_T + {\mathbf{E}}_T {\mathbf{p}}_T + {\mathbf{e}}_T ,\qquad T = V,S,G,D. \end{aligned} $$
(399)
where \({\mathbf {x}}_T = \left [ \begin {array}{ccc} {{\mathbf {a}}_T^T } & {{\mathbf {q}}^T } & {{\mathbf {z}}_T^T } \end {array} \right ]^T\) stands for the unknowns which are also present in the separate per technique solutions. These estimates are accompanied by their normal equations coefficient matrix NT, which in view of its rank deficiency satisfies
$$\displaystyle \begin{aligned} {\mathbf{N}}_T {\mathbf{E}}_T = \left[ \begin{array}{ccc} {{\mathbf{N}}_{{\mathbf{a}}_T} } & {{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T } } & {{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T } } \\ {{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T }^T } & {{\mathbf{N}}_{{\mathbf{q}}_T } } & {{\mathbf{N}}_{{\mathbf{q}}_T {\mathbf{z}}_T } } \\ {{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T }^T } & {{\mathbf{N}}_{{\mathbf{q}}_T {\mathbf{z}}_T }^T } & {{\mathbf{N}}_{{\mathbf{z}}_T } } \end{array} \right]\left[ \begin{array}{c} {{\mathbf{E}}_{\mathbf{a}} } \\ {{\mathbf{E}}_{\mathbf{q}} } \\ {{\mathbf{E}}_{\mathbf{z}} } \end{array} \right] = \mathbf{0},\qquad \quad {\mathbf{E}}_T^T {\mathbf{N}}_T = \mathbf{0}. \end{aligned} $$
(400)
The normal equations formed on the basis of the observation equations (399) are
$$\displaystyle \begin{aligned} \left[ \begin{array}{cc} {{\mathbf{N}}_{{\mathbf{x}}_T} } & {{\mathbf{N}}_{{\mathbf{x}}_T {\mathbf{p}}_T } } \\ {{\mathbf{N}}_{{\mathbf{x}}_T {\mathbf{p}}_T }^T } & {{\mathbf{N}}_{{\mathbf{p}}_T } } \end{array} \right]\left[ \begin{array}{c} {\hat{\mathbf{x}}_T } \\ {\hat{\mathbf{p}}_T } \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{u}}_{{\mathbf{x}}_T } } \\ {{\mathbf{u}}_{{\mathbf{p}}_T } } \end{array} \right], \end{aligned} $$
(401)
where
$$\displaystyle \begin{aligned} &{\mathbf{N}}_{{\mathbf{x}}_T } = {\mathbf{A}}_T^T {\mathbf{N}}_T {\mathbf{A}}_T ,\qquad {\mathbf{N}}_{{\mathbf{x}}_T {\mathbf{p}}_T } = {\mathbf{A}}_T^T {\mathbf{N}}_T {\mathbf{E}}_T ,\qquad \quad {\mathbf{N}}_{{\mathbf{p}}_T } = {\mathbf{E}}_T^T {\mathbf{N}}_T {\mathbf{E}}_T , \\ {} &{\mathbf{u}}_{{\mathbf{x}}_T } = {\mathbf{A}}_T^T {\mathbf{N}}_T {\mathbf{b}}_T ,\quad {\mathbf{u}}_{{\mathbf{p}}_T } = {\mathbf{E}}_T^T {\mathbf{N}}_T {\mathbf{b}}_T . \end{aligned} $$
(402)
In view of \({\mathbf {E}}_T^T {\mathbf {N}}_T = \mathbf {0}\) (Eq. (400)), it holds that \({\mathbf {N}}_{{\mathbf {x}}_T {\mathbf {p}}_T } = \mathbf {0}\), \({\mathbf {N}}_{{\mathbf {p}}_T } = \mathbf {0}\), and the above per technique normal equations degenerate into the two equations \(({\mathbf {A}}_T^T {\mathbf {N}}_T {\mathbf {A}}_T )\hat {\mathbf {x}}_T =\)\({\mathbf {A}}_T^T {\mathbf {N}}_T {\mathbf {b}}_T \) and \(\mathbf {0}\ \hat {\mathbf {p}}_T = \mathbf {0}\) [32]. This simply means that no additional transformation parameters pT (from the joint ITRF to the already determined ITRF of each technique) can be recovered when the rigorous coefficient matrices of the normal equations per technique NT are used as weight matrices. In the rigorous approach each of these matrices has a rank deficiency of 14, due to the lack of definition of the initial epoch reference system and its temporal evolution (rate). The resulting normal equations will be satisfied by any value of the transformation parameters pT whatsoever!
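This degeneracy is easy to reproduce numerically. The following sketch (synthetic dimensions and matrices, not actual space technique normal equations) constructs a singular NT with NTET = 0 and verifies that \({\mathbf{N}}_{{\mathbf{p}}_T} = {\mathbf{E}}_T^T {\mathbf{N}}_T {\mathbf{E}}_T\) vanishes:

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 12, 3
E = rng.standard_normal((n, r))            # stand-in for E_T (null-space basis)
P = np.eye(n) - E @ np.linalg.pinv(E)      # projector onto the complement of range(E)
M = rng.standard_normal((n, n)) @ P
N = M.T @ M                                # singular "normal matrix" with N E = 0

print(np.abs(N @ E).max())                 # ~1e-13: N_T E_T = 0
print(np.abs(E.T @ N @ E).max())           # N_pT = 0: any p_T satisfies 0 p_T = 0
```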

If however the normal equations happen to have full rank, due to a departure from strict rigor, the relation NTET = 0 does not hold anymore and the present approach can indeed be realized. The reasons for the departure from a strictly rigorous approach can be attributed either to the use of non-minimal constraints, such as the infamous loose constraints, or to the introduction of prior information on initial coordinates and velocities (e.g., estimates from a previous ITRF version) with an (incorrectly) non-singular weight matrix. In such a case the joint normal equations follow from the addition of the ones for each technique and those from the tie observations at collocation sites.

We will first give the solution for the case where EOP data are ignored, which is in our opinion the best approach to use, due to the problems related to the EOPs, as has already been explained.

It is obvious from (397) that the set of pseudo-observations \(\hat {\mathbf {z}}_{k,T\vert T} \) is non-adjustable with respect to the parameters zk, so that we can implement Proposition 7 and the relevant results. We can adjust the remaining pseudo-observations by simply using the reduced weight matrix.

The remaining pseudo-observations are in this case the per technique estimates of initial coordinates and velocities \(\hat {\mathbf {a}}_{T\vert T} = \left [ \begin {array}{c} {\hat {\mathbf {x}}_{0,T\vert T} } \\ {\hat {\mathbf {v}}_{T\vert T} } \end {array} \right ]\) and the remaining pseudo-observation equations are simply \(\hat {\mathbf {a}}_{T\vert T} = {\mathbf {a}}_T + {\mathbf {E}}_{\mathbf {a}} {\mathbf {p}}_T + {\mathbf {e}}_{{\mathbf {a}}_T } \), as described by Eq. (395). They are related to those for \(\hat {\mathbf {z}}_{k,T\vert T} \) through the normal equations weight matrix \(\left [ \begin {array}{cc} {{\mathbf {N}}_{{\mathbf {a}}_T } } & {{\mathbf {N}}_{{\mathbf {a}}_T {\mathbf {z}}_T } } \\ {{\mathbf {N}}_{{\mathbf {a}}_T {\mathbf {z}}_T }^T } & {{\mathbf {N}}_{{\mathbf {z}}_T } } \end {array} \right ]\), and the relevant reduced weight matrix becomes \(\bar {\mathbf {N}}_{{\mathbf {a}}_T } = {\mathbf {N}}_{{\mathbf {a}}_T } -\)\({\mathbf {N}}_{{\mathbf {a}}_T {\mathbf {z}}_T } {\mathbf {N}}_{{\mathbf {z}}_T }^{ - 1} {\mathbf {N}}_{{\mathbf {a}}_T {\mathbf {z}}_T }^{T} \), where
$$\displaystyle \begin{aligned} {\mathbf{N}}_{{\mathbf{a}}_T } = \left[ \begin{array}{cc} {{\mathbf{N}}_{{\mathbf{x}}_{0T} } } & {{\mathbf{N}}_{{\mathbf{x}}_{0T} {\mathbf{v}}_T } } \\ {{\mathbf{N}}_{{\mathbf{x}}_{0T} {\mathbf{v}}_T }^T } & {{\mathbf{N}}_{{\mathbf{v}}_T } } \end{array} \right],\qquad {\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T } = \left[ \begin{array}{c} {{\mathbf{N}}_{{\mathbf{x}}_{0T} {\mathbf{z}}_T } } \\ {{\mathbf{N}}_{{\mathbf{v}}_{T} {\mathbf{z}}_T } } \end{array} \right]. \end{aligned} $$
(403)
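Computationally, the reduced weight matrix is a Schur complement. A minimal sketch, with a synthetic positive-definite matrix standing in for the per technique normal matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((30, 10))
N = A.T @ A                       # a positive-definite 10x10 "normal matrix"
na = 6                            # size of the a-block; the rest is the z-block
N_a, N_az, N_z = N[:na, :na], N[:na, na:], N[na:, na:]

# reduced weight matrix: N̄_a = N_a - N_az N_z^{-1} N_az^T
N_a_bar = N_a - N_az @ np.linalg.solve(N_z, N_az.T)
# sanity: the Schur complement of a positive-definite matrix is positive definite
assert np.all(np.linalg.eigvalsh(N_a_bar) > 0)
```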
The contribution to the normal equations from each technique becomes
$$\displaystyle \begin{aligned} \left[ \begin{array}{cc} {\bar{\mathbf{N}}_{{\mathbf{a}}_T } } & {\bar{\mathbf{N}}_{{\mathbf{a}}_T } {\mathbf{E}}_{\mathbf{a}} } \\ {{\mathbf{E}}_{\mathbf{a}}^T \bar{\mathbf{N}}_{{\mathbf{a}}_T } } & {{\mathbf{E}}_{\mathbf{a}}^T \bar{\mathbf{N}}_{{\mathbf{a}}_T } {\mathbf{E}}_{\mathbf{a}} } \end{array} \right]\left[ \begin{array}{c} {\hat{\mathbf{a}}_T } \\ {\hat{\mathbf{p}}_T } \end{array} \right] = \left[ \begin{array}{c} {\bar{\mathbf{N}}_{{\mathbf{a}}_T } \hat{\mathbf{a}}_{T\vert T} } \\ {{\mathbf{E}}_{\mathbf{a}}^T \bar{\mathbf{N}}_{{\mathbf{a}}_T } \hat{\mathbf{a}}_{T\vert T} } \end{array} \right]. \end{aligned} $$
(404)
Adding these contributions and the ones \({\mathbf {N}}_{c,\mathbf {a}} \hat {\mathbf {a}} = {\mathbf {u}}_{c,\mathbf {a}} \) from the tie observations at collocation sites, we arrive at the total normal equations
$$\displaystyle \begin{aligned} \left[ \begin{array}{cc} {{\mathbf{N}}_{\mathbf{a}} + {\mathbf{N}}_{c,\mathbf{a}} } & {{\mathbf{N}}_{\mathbf{ap}} } \\ {{\mathbf{N}}_{\mathbf{ap}}^T } & {{\mathbf{N}}_{\mathbf{p}} } \end{array} \right]\left[ \begin{array}{c} \hat{\mathbf{a}} \\ \hat{\mathbf{p}} \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{u}}_{\mathbf{a}} } \\ {{\mathbf{u}}_{\mathbf{p}} } \end{array} \right], \end{aligned} $$
(405)
where
$$\displaystyle \begin{aligned} &{\mathbf{N}}_{\mathbf{a}} = \mathrm{BD}(\bar{\mathbf{N}}_{{\mathbf{a}}_T } ),\quad {\mathbf{N}}_{\mathbf{ap}} = \mathrm{BD}(\bar{\mathbf{N}}_{{\mathbf{a}}_T } {\mathbf{E}}_{\mathbf{a}} ),\quad {\mathbf{N}}_{\mathbf{p}} = \mathrm{BD}({\mathbf{E}}_{\mathbf{a}}^T \bar{\mathbf{N}}_{{\mathbf{a}}_T } {\mathbf{E}}_{\mathbf{a}} ), \\ {} &\hat{\mathbf{a}} = \mathrm{BC}(\hat{\mathbf{a}}_T ),\quad \hat{\mathbf{p}} = \mathrm{BC}(\hat{\mathbf{p}}_T ),\quad {\mathbf{u}}_{\mathbf{a}} = \mathrm{BC}(\bar{\mathbf{N}}_{{\mathbf{a}}_T } \hat{\mathbf{a}}_{T\vert T} ),\quad {\mathbf{u}}_{\mathbf{p}} = \mathrm{BC}({\mathbf{E}}_{\mathbf{a}}^T \bar{\mathbf{N}}_{{\mathbf{a}}_T } \hat{\mathbf{a}}_{T\vert T} ). \end{aligned} $$
(406)
To solve the above normal equations, a set of minimal constraints of the general form \({\mathbf {C}}_{\mathbf {a}}^T \mathbf {a} + {\mathbf {C}}_{\mathbf {p}}^T \mathbf {p} = \mathbf {d}\) must be introduced. In this case the unique solution becomes
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} \hat{\mathbf{a}} \\ \hat{\mathbf{p}} \end{array} \right] = \left[ \begin{array}{cc} {{\mathbf{N}}_{\mathbf{a}} + {\mathbf{N}}_{c,\mathbf{a}} + {\mathbf{C}}_{\mathbf{a}} {\mathbf{C}}_{\mathbf{a}}^T } & {{\mathbf{N}}_{\mathbf{ap}}+ {\mathbf{C}}_{\mathbf{a}} {\mathbf{C}}_{\mathbf{p}}^T } \\ {{\mathbf{N}}_{\mathbf{ap}}^T + {\mathbf{C}}_{\mathbf{p}} {\mathbf{C}}_{\mathbf{a}}^T } & {{\mathbf{N}}_{\mathbf{p}} + {\mathbf{C}}_{\mathbf{p}} {\mathbf{C}}_{\mathbf{p}}^T } \end{array} \right]^{ - 1}\left[ \begin{array}{c} {{\mathbf{u}}_{\mathbf{a}} + {\mathbf{C}}_{\mathbf{a}} \mathbf{d}} \\ {{\mathbf{u}}_{\mathbf{p}} + {\mathbf{C}}_{\mathbf{p}} \mathbf{d}} \end{array} \right], \end{aligned} $$
(407)
and has covariance factor matrix
$$\displaystyle \begin{aligned} &\left[ \begin{array}{cc} {{\mathbf{Q}}_{\hat{\mathbf{a}}} } & {{\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{p}}} } \\ {{\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{p}}}^T } & {{\mathbf{Q}}_{\hat{\mathbf{p}}} } \end{array} \right] =\\ &= \left[ \begin{array}{cc} {{\mathbf{N}}_{\mathbf{a}} + {\mathbf{N}}_{c,\mathbf{a}} + {\mathbf{C}}_{\mathbf{a}} {\mathbf{C}}_{\mathbf{a}}^T } & {{\mathbf{N}}_{\mathbf{ap}}+ {\mathbf{C}}_{\mathbf{a}} {\mathbf{C}}_{\mathbf{p}}^T } \\ {{\mathbf{N}}_{\mathbf{ap}}^T + {\mathbf{C}}_{\mathbf{p}} {\mathbf{C}}_{\mathbf{a}}^T } & {{\mathbf{N}}_{\mathbf{p}} + {\mathbf{C}}_{\mathbf{p}} {\mathbf{C}}_{\mathbf{p}}^T } \end{array} \right]^{ - 1}\left[ \begin{array}{cc} {{\mathbf{N}}_{\mathbf{a}} + {\mathbf{N}}_{c,\mathbf{a}} } & {{\mathbf{N}}_{\mathbf{ap}} } \\ {{\mathbf{N}}_{\mathbf{ap}}^T } & {{\mathbf{N}}_{\mathbf{p}} } \end{array} \right]\times\\ &\qquad \qquad \qquad \qquad \times\left[ \begin{array}{cc} {{\mathbf{N}}_{\mathbf{a}} + {\mathbf{N}}_{c,\mathbf{a}} + {\mathbf{C}}_{\mathbf{a}} {\mathbf{C}}_{\mathbf{a}}^T } & {{\mathbf{N}}_{\mathbf{ap}} + {\mathbf{C}}_{\mathbf{a}} {\mathbf{C}}_{\mathbf{p}}^T } \\ {{\mathbf{N}}_{\mathbf{ap}}^T + {\mathbf{C}}_{\mathbf{p}} {\mathbf{C}}_{\mathbf{a}}^T } & {{\mathbf{N}}_{\mathbf{p}} + {\mathbf{C}}_{\mathbf{p}} {\mathbf{C}}_{\mathbf{p}}^T } \end{array} \right]^{ - 1} =\\ {} &= \left[ \begin{array}{cc} {{\mathbf{N}}_{\mathbf{a}} + {\mathbf{N}}_{c,\mathbf{a}} + {\mathbf{C}}_{\mathbf{a}} {\mathbf{C}}_{\mathbf{a}}^T } & {{\mathbf{N}}_{\mathbf{ap}}+ {\mathbf{C}}_{\mathbf{a}} {\mathbf{C}}_{\mathbf{p}}^T } \\ {{\mathbf{N}}_{\mathbf{ap}}^T + {\mathbf{C}}_{\mathbf{p}} {\mathbf{C}}_{\mathbf{a}}^T } & {{\mathbf{N}}_{\mathbf{p}} + {\mathbf{C}}_{\mathbf{p}} {\mathbf{C}}_{\mathbf{p}}^T } \end{array} \right]^{ - 1} - \left[ \begin{array}{cc} {{\mathbf{E}}_{\mathbf{a}} {\mathbf{R}}^{ - 1}{\mathbf{E}}_{\mathbf{a}}^T } & {{\mathbf{E}}_{\mathbf{a}} {\mathbf{R}}^{ - 1}{\mathbf{E}}_{\mathbf{p}}^T } \\ {{\mathbf{E}}_{\mathbf{p}} {\mathbf{R}}^{ - 1}{\mathbf{E}}_{\mathbf{a}}^T } & {{\mathbf{E}}_{\mathbf{p}} {\mathbf{R}}^{ - 1}{\mathbf{E}}_{\mathbf{p}}^T } \end{array} \right], \end{aligned} $$
(408)
where \(\mathbf {R} = ({\mathbf {E}}_{\mathbf {a}}^T {\mathbf {C}}_{\mathbf {a}} + {\mathbf {E}}_{\mathbf {p}}^T {\mathbf {C}}_{\mathbf {p}} )({\mathbf {C}}_{\mathbf {a}}^T {\mathbf {E}}_{\mathbf {a}} + {\mathbf {C}}_{\mathbf {p}}^T {\mathbf {E}}_{\mathbf {p}} )\).
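The minimal-constraints recipe of Eq. (407), i.e., augmenting the singular normal matrix by CCT and the right-hand side by Cd, can be sketched as follows (a one-dimensional null space and synthetic data; all quantities are toy stand-ins):

```python
import numpy as np

rng = np.random.default_rng(3)
e = np.ones(5) / np.sqrt(5)                  # null-space basis (stand-in for E)
M = rng.standard_normal((8, 5))
M = M - (M @ e)[:, None] * e                 # force M e = 0
N = M.T @ M                                  # singular normal matrix, null(N) = span(e)
u = N @ rng.standard_normal(5)               # consistent right-hand side
C, d = e[:, None], np.zeros(1)               # inner-type minimal constraint C^T x = d

x_hat = np.linalg.solve(N + C @ C.T, u + C @ d)
print(np.abs(C.T @ x_hat - d).max())         # constraint satisfied (~1e-15)
print(np.abs(N @ x_hat - u).max())           # normal equations satisfied
```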
The optimal estimates of the transformation parameters can be recovered, if desired, from
$$\displaystyle \begin{aligned} \hat{\mathbf{z}}_T = \hat{\mathbf{z}}_{T\vert T} + {\mathbf{N}}_{{\mathbf{z}}_T }^{ - 1} {\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T }^T (\hat{\mathbf{a}}_{T\vert T} - \hat{\mathbf{a}}_T ) - ({\mathbf{N}}_{{\mathbf{z}}_T }^{ - 1} {\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T }^T {\mathbf{E}}_{\mathbf{a}} + {\mathbf{E}}_{\mathbf{z}} )\hat{\mathbf{p}}_T . \end{aligned} $$
(409)
A non-rigorous version, leading to a suboptimal solution, results from ignoring altogether the transformation parameter estimates \(\hat {\mathbf {z}}_{k,T\vert T} \) from the four techniques. The normal equations in this case have exactly the same form as above, with the only difference being that the unreduced weight matrix \({\mathbf {N}}_{{\mathbf {a}}_T } \) appears in the place of the reduced weight matrix \(\bar {\mathbf {N}}_{{\mathbf {a}}_T } \).

The EOPs can be independently combined a posteriori. The original EOP time series must first be converted to the reference system of each technique, using the updated estimates \(\hat {\mathbf {z}}_T \) rather than the ones \(\hat {\mathbf {z}}_{T\vert T} \) from each technique. Next they must be converted to the final ITRF reference system, utilizing the relevant transformation parameter estimates \(\hat {\mathbf {p}}_T \). Once they all refer to the same reference system, they can be combined as previously explained in Sect. 16.1.

Despite our reservations, we include for the sake of completeness the approach where the per technique EOP estimates \(\hat {\mathbf {q}}_{T\vert T} = {\mathbf {L}}_T \mathbf {q} + {\mathbf {E}}_{\mathbf {q}} {\mathbf {p}}_T + {\mathbf {e}}_{{\mathbf {q}}_T } \) are included in the combination step. The difference in this case is that the weight matrix associated with the per technique pseudo-observation equations and the resulting reduced weight matrix are, respectively,
$$\displaystyle \begin{aligned} & {\mathbf{N}}_{{\mathbf{x}}_T } = \left[\begin{array}{ccc} {{\mathbf{N}}_{{\mathbf{a}}_T } } & {{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T } } & {{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T } } \\ {{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T }^T } & {{\mathbf{N}}_{{\mathbf{q}}_T } } & {{\mathbf{N}}_{{\mathbf{q}}_T {\mathbf{z}}_T } } \\ {{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T }^T } & {{\mathbf{N}}_{{\mathbf{q}}_T {\mathbf{z}}_T }^T } & {{\mathbf{N}}_{{\mathbf{z}}_T } } \end{array} \right], \end{aligned} $$
(410)
$$\displaystyle \begin{aligned} &\bar{\mathbf{N}}_{{\mathbf{x}}_T } = \left[ \begin{array}{cc} {\bar{\mathbf{N}}_{{\mathbf{a}}_T } } & {\bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T } } \\ {\bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T }^T } & {\bar{\mathbf{N}}_{{\mathbf{q}}_T } } \end{array} \right] = \left[ \begin{array}{cc} {{\mathbf{N}}_{{\mathbf{a}}_T } } & {{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T } } \\ {{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T }^T } & {{\mathbf{N}}_{{\mathbf{q}}_T } } \end{array} \right] - \left[ \begin{array}{c} {{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T } } \\ {{\mathbf{N}}_{{\mathbf{q}}_T {\mathbf{z}}_T } } \end{array} \right]{\mathbf{N}}_{{\mathbf{z}}_T }^{ - 1} \left[ \begin{array}{cc} {{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T }^T } & {{\mathbf{N}}_{{\mathbf{q}}_T {\mathbf{z}}_T }^T } \end{array} \right] = \\ &\quad \ \ \, = \left[ \begin{array}{cc} {{\mathbf{N}}_{{\mathbf{a}}_T } - {\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T } {\mathbf{N}}_{{\mathbf{z}}_T }^{ - 1} {\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T }^T } & {{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T } - {\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T } {\mathbf{N}}_{{\mathbf{z}}_T }^{ - 1} {\mathbf{N}}_{{\mathbf{q}}_T {\mathbf{z}}_T }^{T} } \\ {{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T }^T - {\mathbf{N}}_{{\mathbf{q}}_T {\mathbf{z}}_T } {\mathbf{N}}_{{\mathbf{z}}_T }^{-1} {\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T }^T } & {{\mathbf{N}}_{{\mathbf{q}}_T } - {\mathbf{N}}_{{\mathbf{q}}_T {\mathbf{z}}_T } {\mathbf{N}}_{{\mathbf{z}}_T }^{ - 1} {\mathbf{N}}_{{\mathbf{q}}_T {\mathbf{z}}_T }^{T} } \end{array} \right]. \end{aligned} $$
(411)
The normal equations from each technique are
$$\displaystyle \begin{aligned} \left[ \begin{array}{ccc} {\bar{\mathbf{N}}_{{\mathbf{a}}_T } } & {\bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T } {\mathbf{L}}_T } & {\tilde{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{p}}_T } } \\ {{\mathbf{L}}_T^T \bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T }^T } & {{\mathbf{L}}_T^T \bar{\mathbf{N}}_{{\mathbf{q}}_T } {\mathbf{L}}_T } & {\tilde{\mathbf{N}}_{{\mathbf{q}}_T {\mathbf{p}}_T } } \\ {\tilde{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{p}}_T }^T } & {\tilde{\mathbf{N}}_{{\mathbf{q}}_T {\mathbf{p}}_T }^T } & {\tilde{\mathbf{N}}_{{\mathbf{p}}_T } } \end{array} \right]\left[ \begin{array}{c} {\hat{\mathbf{a}}_T } \\ \hat{\mathbf{q}} \\ {\hat{\mathbf{p}}_T } \end{array} \right] = \left[ \begin{array}{c} {\tilde{\mathbf{u}}_{{\mathbf{a}}_T } } \\ {\tilde{\mathbf{u}}_{{\mathbf{q}}_T } } \\ {\tilde{\mathbf{u}}_{{\mathbf{p}}_T } } \end{array} \right], \end{aligned} $$
(412)
where
$$\displaystyle \begin{aligned} &\tilde{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{p}}_T } = \bar{\mathbf{N}}_{{\mathbf{a}}_T } {\mathbf{E}}_{\mathbf{a}} + \bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T } {\mathbf{E}}_{\mathbf{q}} , \\ &\tilde{\mathbf{N}}_{{\mathbf{q}}_T {\mathbf{p}}_T } = {\mathbf{L}}_T^T \bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T }^T {\mathbf{E}}_{\mathbf{a}} + {\mathbf{L}}_T^T \bar{\mathbf{N}}_{{\mathbf{q}}_T} {\mathbf{E}}_{\mathbf{q}} , \\ &\tilde{\mathbf{N}}_{{\mathbf{p}}_T } = {\mathbf{E}}_{\mathbf{a}}^T (\bar{\mathbf{N}}_{{\mathbf{a}}_T } {\mathbf{E}}_{\mathbf{a}} + \bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T } {\mathbf{E}}_{\mathbf{q}} ) + {\mathbf{E}}_{\mathbf{q}}^T (\bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T }^T {\mathbf{E}}_{\mathbf{a}} + \bar{\mathbf{N}}_{{\mathbf{q}}_T } {\mathbf{E}}_{\mathbf{q}} ), \\ &\tilde{\mathbf{u}}_{{\mathbf{a}}_T } = \bar{\mathbf{N}}_{{\mathbf{a}}_T } \hat{\mathbf{a}}_{T\vert T} + \bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T } \hat{\mathbf{q}}_{T\vert T} , \\ &\tilde{\mathbf{u}}_{{\mathbf{q}}_T } = {\mathbf{L}}_T^T \bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T }^T \hat{\mathbf{a}}_{T\vert T} + {\mathbf{L}}_T^T \bar{\mathbf{N}}_{{\mathbf{q}}_T } \hat{\mathbf{q}}_{T\vert T} , \\ &\tilde{\mathbf{u}}_{{\mathbf{p}}_T } = ({\mathbf{E}}_{\mathbf{a}}^T \bar{\mathbf{N}}_{{\mathbf{a}}_T } + {\mathbf{E}}_{\mathbf{q}}^T \bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T }^T )\hat{\mathbf{a}}_{T\vert T} + ({\mathbf{E}}_{\mathbf{a}}^T \bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T } + {\mathbf{E}}_{\mathbf{q}}^T \bar{\mathbf{N}}_{{\mathbf{q}}_T } )\hat{\mathbf{q}}_{T\vert T} . {} \end{aligned} $$
(413)
The joint normal equations follow from the addition of the above contributions from each technique and the ones \({\mathbf {N}}_{c,\mathbf {a}} \hat {\mathbf {a}} = {\mathbf {u}}_{c,\mathbf {a}} \) from the tie observations at collocation sites. They have the form
$$\displaystyle \begin{aligned} \left[ \begin{array}{ccc} {{\mathbf{N}}_{\mathbf{a}} + {\mathbf{N}}_{c,\mathbf{a}} } & {{\mathbf{N}}_{\mathbf{aq}} } & {{\mathbf{N}}_{\mathbf{ap}} } \\ {{\mathbf{N}}_{\mathbf{aq}}^T } & {{\mathbf{N}}_{\mathbf{q}} } & {{\mathbf{N}}_{\mathbf{qp}} } \\ {{\mathbf{N}}_{\mathbf{ap}}^T } & {{\mathbf{N}}_{\mathbf{qp}}^T } & {{\mathbf{N}}_{\mathbf{p}} } \end{array} \right]\left[ \begin{array}{c} \hat{\mathbf{a}} \\ \hat{\mathbf{q}} \\ \hat{\mathbf{p}} \end{array} \right]=\left[ \begin{array}{c} {{\mathbf{u}}_{\mathbf{a}} + {\mathbf{u}}_{c,\mathbf{a}} } \\ {{\mathbf{u}}_{\mathbf{q}} } \\ {{\mathbf{u}}_{\mathbf{p}} } \end{array} \right], \end{aligned} $$
(414)
where
$$\displaystyle \begin{aligned} &\hat{\mathbf{a}} = \mathrm{BC}(\hat{\mathbf{a}}_T ),\quad \hat{\mathbf{p}} = \mathrm{BC}(\hat{\mathbf{p}}_T ), \end{aligned} $$
(415)
$$\displaystyle \begin{aligned} &{\mathbf{N}}_{\mathbf{a}} = \mathrm{BD}(\bar{\mathbf{N}}_{{\mathbf{a}}_T } ),\\ &{\mathbf{N}}_{\mathbf{aq}} = \mathrm{BC}(\bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T } {\mathbf{L}}_T ),\\ &{\mathbf{N}}_{\mathbf{ap}} = \mathrm{BD}(\tilde{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{p}}_T } ) = \mathrm{BD}(\bar{\mathbf{N}}_{{\mathbf{a}}_T } {\mathbf{E}}_{\mathbf{a}} + \bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T } {\mathbf{E}}_{\mathbf{q}} ), \\ &{\mathbf{N}}_{\mathbf{q}} = \sum_T {{\mathbf{L}}_T^T \bar{\mathbf{N}}_{{\mathbf{q}}_T} {\mathbf{L}}_T ,} \\ &{\mathbf{N}}_{\mathbf{qp}} = \mathrm{BR}(\tilde{\mathbf{N}}_{{\mathbf{q}}_T {\mathbf{p}}_T } ) = \mathrm{BR}({\mathbf{L}}_T^T \bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T }^T {\mathbf{E}}_{\mathbf{a}} + {\mathbf{L}}_T^T \bar{\mathbf{N}}_{{\mathbf{q}}_T } {\mathbf{E}}_{\mathbf{q}} ), \\ {} &{\mathbf{N}}_{\mathbf{p}} = \mathrm{BD}(\tilde{\mathbf{N}}_{{\mathbf{p}}_T } ) = \mathrm{BD}\left[ {{\mathbf{E}}_{\mathbf{a}}^T (\bar{\mathbf{N}}_{{\mathbf{a}}_T } {\mathbf{E}}_{\mathbf{a}} + \bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T } {\mathbf{E}}_{\mathbf{q}} ) + {\mathbf{E}}_{\mathbf{q}}^T (\bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T }^T {\mathbf{E}}_{\mathbf{a}} + \bar{\mathbf{N}}_{{\mathbf{q}}_T } {\mathbf{E}}_{\mathbf{q}} )} \right], \end{aligned} $$
(416)
$$\displaystyle \begin{aligned} &{\mathbf{u}}_{\mathbf{a}} = \mathrm{BC}(\tilde{\mathbf{u}}_{{\mathbf{a}}_T } ) = \mathrm{BC}(\bar{\mathbf{N}}_{{\mathbf{a}}_T } \hat{\mathbf{a}}_{T\vert T} + \bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T } \hat{\mathbf{q}}_{T\vert T} ), \\ &{\mathbf{u}}_{\mathbf{q}} = \sum_T \tilde{\mathbf{u}}_{{\mathbf{q}}_T } = \sum_T {\left( {{\mathbf{L}}_T^T \bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T }^T \hat{\mathbf{a}}_{T\vert T} + {\mathbf{L}}_T^T \bar{\mathbf{N}}_{{\mathbf{q}}_T } \hat{\mathbf{q}}_{T\vert T} } \right)} ,\\ &{} {\mathbf{u}}_{\mathbf{p}} = \mathrm{BC}(\tilde{\mathbf{u}}_{{\mathbf{p}}_T } ) = \mathrm{BC}\left[ {({\mathbf{E}}_{\mathbf{a}}^T \bar{\mathbf{N}}_{{\mathbf{a}}_T } + {\mathbf{E}}_{\mathbf{q}}^T \bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T }^T )\hat{\mathbf{a}}_{T\vert T} + ({\mathbf{E}}_{\mathbf{a}}^T \bar{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{q}}_T } + {\mathbf{E}}_{\mathbf{q}}^T \bar{\mathbf{N}}_{{\mathbf{q}}_T } )\hat{\mathbf{q}}_{T\vert T} } \right]. \end{aligned} $$
(417)
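The block operators BD, BC and BR appearing in (414)-(417) are plain block-diagonal, vertical and horizontal stacking operations over the techniques T = V, S, G, D; a minimal sketch with toy blocks:

```python
import numpy as np
from scipy.linalg import block_diag

def BD(blocks):  # block diagonal
    return block_diag(*blocks)

def BC(blocks):  # block column: stack vertically
    return np.vstack(blocks)

def BR(blocks):  # block row: stack horizontally
    return np.hstack(blocks)

# toy per technique blocks (sizes arbitrary but mutually consistent)
N_a_T = [np.eye(2) * w for w in (1.0, 2.0, 3.0, 4.0)]
print(BD(N_a_T).shape)                        # (8, 8): the joint N_a
print(BC([b[:, :1] for b in N_a_T]).shape)    # (8, 1)
print(BR([b[:1, :] for b in N_a_T]).shape)    # (1, 8)
```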
A unique solution to the normal equations can be obtained after the introduction of a set of minimal constraints of the general form \({\mathbf {C}}_{\mathbf {a}}^T \mathbf {a} + {\mathbf {C}}_{\mathbf {q}}^T \mathbf {q} + {\mathbf {C}}_{\mathbf {p}}^T \mathbf {p} = \mathbf {d}\), following the general relations (78) and (79) or (85) and (87). Typically, constraints of the form \({\mathbf {C}}_{\mathbf {a}}^T \mathbf {a} + {\mathbf {C}}_{\mathbf {p}}^T \mathbf {p} = \mathbf {d}\), which do not involve EOP parameters, are implemented. It suffices to use minimal constraints \({\mathbf {C}}_{\mathbf {a}}^T \mathbf {a} = \mathbf {d}\) involving only initial coordinates and velocities. Since any minimally constrained solution can be easily converted to a solution satisfying desired minimal constraints, our advice is to use the partial inner constraints \({\mathbf {E}}_{\mathbf {a}}^T \mathbf {a} = \mathbf {0}\), with the simpler solution
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} \hat{\mathbf{a}} \\ \hat{\mathbf{q}} \\ \hat{\mathbf{p}} \end{array} \right] = \left[ \begin{array}{ccc} {{\mathbf{N}}_{\mathbf{a}} + {\mathbf{N}}_{c,\mathbf{a}} + {\mathbf{E}}_{\mathbf{a}} {\mathbf{E}}_{\mathbf{a}}^T } & {{\mathbf{N}}_{\mathbf{aq}} } & {{\mathbf{N}}_{\mathbf{ap}} } \\ {{\mathbf{N}}_{\mathbf{aq}}^T } & {{\mathbf{N}}_{\mathbf{q}} } & {{\mathbf{N}}_{\mathbf{qp}} } \\ {{\mathbf{N}}_{\mathbf{ap}}^T } & {{\mathbf{N}}_{\mathbf{qp}}^T } & {{\mathbf{N}}_{\mathbf{p}} } \end{array} \right]^{ - 1}\left[ \begin{array}{c} {{\mathbf{u}}_{\mathbf{a}} + {\mathbf{u}}_{c,\mathbf{a}} } \\ {{\mathbf{u}}_{\mathbf{q}} } \\ {{\mathbf{u}}_{\mathbf{p}} } \end{array} \right], \end{aligned} $$
(418)
and covariance factor matrices
$$\displaystyle \begin{aligned} \left[ \begin{array}{ccc} {{\mathbf{Q}}_{\hat{\mathbf{a}}} } & {{\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{q}}} } & {{\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{p}}} } \\ {{\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{q}}}^T } & {{\mathbf{Q}}_{\hat{\mathbf{q}}} } & {{\mathbf{Q}}_{\hat{\mathbf{q}}\hat{\mathbf{p}}} } \\ {{\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{p}}}^T } & {{\mathbf{Q}}_{\hat{\mathbf{q}}\hat{\mathbf{p}}}^T } & {{\mathbf{Q}}_{\hat{\mathbf{p}}} } \end{array} \right] & = \left[ \begin{array}{ccc} {{\mathbf{N}}_{\mathbf{a}} + {\mathbf{N}}_{c,\mathbf{a}} + {\mathbf{E}}_{\mathbf{a}} {\mathbf{E}}_{\mathbf{a}}^T } & {{\mathbf{N}}_{\mathbf{aq}} } & {{\mathbf{N}}_{\mathbf{ap}} } \\ {{\mathbf{N}}_{\mathbf{aq}}^T } & {{\mathbf{N}}_{\mathbf{q}} } & {{\mathbf{N}}_{\mathbf{qp}} } \\ {{\mathbf{N}}_{\mathbf{ap}}^T } & {{\mathbf{N}}_{\mathbf{qp}}^T } & {{\mathbf{N}}_{\mathbf{p}} } \end{array} \right]^{ - 1} -\\ &\quad - \left[ \begin{array}{ccc} {{\mathbf{E}}_{\mathbf{a}} ({\mathbf{E}}_{\mathbf{a}}^T {\mathbf{E}}_{\mathbf{a}} )^{ - 2}{\mathbf{E}}_{\mathbf{a}}^T } & \ \ {{\mathbf{E}}_{\mathbf{a}} ({\mathbf{E}}_{\mathbf{a}}^T {\mathbf{E}}_{\mathbf{a}} )^{ - 2}{\mathbf{E}}_{\mathbf{q}}^T } & \ \ {{\mathbf{E}}_{\mathbf{a}} ({\mathbf{E}}_{\mathbf{a}}^T {\mathbf{E}}_{\mathbf{a}} )^{ - 2}{\mathbf{E}}_{\mathbf{p}}^T } \\ {{\mathbf{E}}_{\mathbf{q}} ({\mathbf{E}}_{\mathbf{a}}^T {\mathbf{E}}_{\mathbf{a}} )^{ - 2}{\mathbf{E}}_{\mathbf{a}}^T } & \ \ {{\mathbf{E}}_{\mathbf{q}} ({\mathbf{E}}_{\mathbf{a}}^T {\mathbf{E}}_{\mathbf{a}} )^{ - 2}{\mathbf{E}}_{\mathbf{q}}^T } & \ \ {{\mathbf{E}}_{\mathbf{q}} ({\mathbf{E}}_{\mathbf{a}}^T {\mathbf{E}}_{\mathbf{a}} )^{ - 2}{\mathbf{E}}_{\mathbf{p}}^T } \\ {{\mathbf{E}}_{\mathbf{p}} ({\mathbf{E}}_{\mathbf{a}}^T {\mathbf{E}}_{\mathbf{a}} )^{ - 2}{\mathbf{E}}_{\mathbf{a}}^T } & \ {{\mathbf{E}}_{\mathbf{p}} ({\mathbf{E}}_{\mathbf{a}}^T {\mathbf{E}}_{\mathbf{a}} )^{ - 2}{\mathbf{E}}_{\mathbf{q}}^T } & \ {{\mathbf{E}}_{\mathbf{p}} ({\mathbf{E}}_{\mathbf{a}}^T {\mathbf{E}}_{\mathbf{a}} )^{ - 2}{\mathbf{E}}_{\mathbf{p}}^T } \end{array} \right]. \end{aligned} $$
(419)
Updated estimates of the transformation parameters can be obtained if desired, using the relation
$$\displaystyle \begin{aligned} \hat{\mathbf{z}}_T & = \hat{\mathbf{z}}_{T\vert T} + {\mathbf{N}}_{{\mathbf{z}}_T }^{ - 1} [{\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T }^{ T} (\hat{\mathbf{a}}_{T\vert T} - \hat{\mathbf{a}}_T ) + {\mathbf{N}}_{{\mathbf{q}}_T {\mathbf{z}}_T }^{T} (\hat{\mathbf{q}}_{T\vert T} - {\mathbf{L}}_T \hat{\mathbf{q}})]-\\ &\quad - [{\mathbf{N}}_{{\mathbf{z}}_T }^{ - 1} ({\mathbf{N}}_{{\mathbf{a}}_T {\mathbf{z}}_T }^T {\mathbf{E}}_{\mathbf{a}} + {\mathbf{N}}_{{\mathbf{q}}_T {\mathbf{z}}_T }^T {\mathbf{E}}_{\mathbf{q}} ) + {\mathbf{E}}_{\mathbf{z}} ]\hat{\mathbf{p}}_T . \end{aligned} $$
(420)
The related covariance matrices can be sequentially computed according to Eqs. (311), which in this case, where a is replaced by \([\mathbf{a}^T\ \mathbf{q}^T\ \mathbf{p}^T]^T\), take the form
$$\displaystyle \begin{aligned} &\left[ \begin{array}{c} {{\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{z}}} } \\ {{\mathbf{Q}}_{\hat{\mathbf{q}}\hat{\mathbf{z}}}} \\ {{\mathbf{Q}}_{\hat{\mathbf{p}}\hat{\mathbf{z}}}} \end{array} \right] = - \left[ \begin{array}{ccc} {{\mathbf{Q}}_{\hat{\mathbf{a}}} } & {{\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{q}}} } & {{\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{p}}} } \\ {{\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{q}}}^T } & {{\mathbf{Q}}_{\hat{\mathbf{q}}} } & {{\mathbf{Q}}_{\hat{\mathbf{q}}\hat{\mathbf{p}}} } \\ {{\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{p}}}^T } & {{\mathbf{Q}}_{\hat{\mathbf{q}}\hat{\mathbf{p}}}^T } & {{\mathbf{Q}}_{\hat{\mathbf{p}}} } \end{array} \right]\left[ \begin{array}{c} {{\mathbf{N}}_{\mathbf{az}} } \\ {{\mathbf{N}}_{\mathbf{qz}} } \\ {{\mathbf{N}}_{\mathbf{pz}} } \end{array} \right]{\mathbf{N}}_{\mathbf{z}}^{ - 1} , \\ {} &{\mathbf{Q}}_{\hat{\mathbf{z}}} = {\mathbf{N}}_{\mathbf{z}}^{ - 1} - {\mathbf{N}}_{\mathbf{z}}^{ - 1} [{\begin{array}{ccc} {{\mathbf{N}}_{\mathbf{az}}^T } & {{\mathbf{N}}_{\mathbf{qz}}^T } & {{\mathbf{N}}_{\mathbf{pz}}^T } \end{array} }]\left[ \begin{array}{c} {{\mathbf{Q}}_{\hat{\mathbf{a}}\hat{\mathbf{z}}} } \\ {{\mathbf{Q}}_{\hat{\mathbf{q}}\hat{\mathbf{z}}} } \\ {{\mathbf{Q}}_{\hat{\mathbf{p}}\hat{\mathbf{z}}} } \end{array} \right]. \end{aligned} $$
(421)

18 ITRF Formulation: Some Remarks on the Origin and Scale of the Reference System

In our previous discussions, we examined the use of transformation parameters zk connecting the final reference system, to be introduced by minimal constraints, with the already established one for the station coordinates in each solution at epoch tk. We have not though specified which of the seven transformation parameters (rotation angles θ1, θ2, θ3, translation components d1, d2, d3, and scale parameter s) should be included in each case. They should all be included if the original observations were invariant with respect to rotations, translations and change of scale. This is not the case however. All space techniques have their own scale, i.e., their own unit of length, which follows from the time unit as realized through the use of a particular set of atomic clocks. SLR involves tracking of low orbiting satellites, which sense the gravity field of the earth and hence the position of the geocenter. The same is true for DORIS and GPS to a certain extent, while only VLBI observations are completely invariant to translations, due to the practically infinite distance of the observed extragalactic radio sources. In computational work, however, one needs to take into consideration not only the theoretically absolute rank defects but also close to rank defect situations. In this respect, the rank deficiency indices introduced by Chatzinikos and Dermanis [21], already discussed in chapter 7, provide the means not only for detecting, but also for interpreting and quantifying rank defect or close to rank defect situations. It turns out that only SLR gives reliable geocenter information, while only the scale provided by VLBI and SLR is worth considering in combined solutions.

We will discuss first how geocenter information should be incorporated in the simpler case of stacking an SLR coordinate time series alone. Geocentric coordinates appear to be indispensable for geophysical applications, but not so for geodetic positioning and mapping applications. To understand this, consider the extreme hypothetical case of an earth with a completely rigid lithosphere, with respect to which the position of the geocenter varies as a consequence of mass redistribution within the earth. In this case, the obvious geodetic choice is a reference system fixed with respect to the lithosphere. But even in the real situation of a deforming lithosphere it is reasonable to seek a reference system with respect to which coordinates do not demonstrate unnecessary variations, for example the one established by kinematic constraints, where the origin is either the barycenter of the geodetic network, or more generally one with constant barycenter coordinates. On the other hand, even in geophysical applications, the call for a geocentric reference system does not mean that coordinates should follow the geocenter in its variations of relatively higher frequencies, but rather a smoothed version of the geocenter, where, e.g., non-linear temporal variations are left out [43, 44, 57]. In any case, the incorporation of a linear-in-time coordinate variation model filters out such nonlinear variations and does not allow them to influence the choice of the reference system, either barycentric or geocentric. The bottom line is that the choice between a geocentric and a barycentric reference system is a pseudo-problem, since coordinates can be easily converted from one to the other.

Returning to the problem of establishing a geocentric reference system in the stacking of SLR coordinate time series, there are two choices. The first is to introduce only rotation and scale transformation parameters zk for each epoch tk, thus allowing the geocentric reference system of each epoch to enforce a geocentric reference system in the final solution, where only orientation, scale and their rates are defined via minimal constraints. (Scale information in SLR has been ignored here for the sake of the argument.) The second is to introduce all possible seven transformation parameters per epoch and to establish a geocentric reference system either through appropriate minimal constraints or by a posteriori conversion.

Turning to the stacking model (133), (314)-(316), we may split the term Eizk into two terms
$$\displaystyle \begin{aligned} {\mathbf{E}}_i {\mathbf{z}}_k {=} \left[ {[{\mathbf{x}}_{0i}^{ap} \times ]\quad {\mathbf{I}}_3 \quad {\mathbf{x}}_{0i}^{ap} } \right]\left[ \begin{array}{c} {\boldsymbol{\uptheta }_k } \\ {{\mathbf{d}}_k } \\ {s_k } \end{array} \right] {=} \left[ {[{\mathbf{x}}_{0i}^{ap} \times ]\quad {\mathbf{x}}_{0i}^{ap} } \right]\left[ \begin{array}{c} {\boldsymbol{\uptheta }_k } \\ {s_k } \end{array} \right] + {\mathbf{d}}_k {\equiv} {\mathbf{E}}_{a,i} {\mathbf{z}}_{a,k} + {\mathbf{E}}_{b,i} {\mathbf{z}}_{b,k} , \end{aligned} $$
(422)
so that for all stations, the choice is between the use of the restricted observation equations
$$\displaystyle \begin{aligned} \mathbf{b} = ({\mathbf{1}}_{m} \otimes {\mathbf{I}}_{3n} )\delta{\mathbf{x}}_0 + (\boldsymbol{\uptau } \otimes {\mathbf{I}}_{3n} )\delta\mathbf{v} + ({\mathbf{I}}_m \otimes {\mathbf{E}}_a ){\mathbf{z}}_a + \mathbf{e} \equiv \mathbf{J}\delta{\mathbf{x}}_{0} + {\mathbf{J}}_t \delta\mathbf{v} + {\mathbf{G}}_a {\mathbf{z}}_a + \mathbf{e},\end{aligned} $$
(423)
and the original extended one
$$\displaystyle \begin{aligned} \mathbf{b} &= ({\mathbf{1}}_{m} \otimes {\mathbf{I}}_{3n} )\delta{\mathbf{x}}_0 + (\boldsymbol{\uptau } \otimes {\mathbf{I}}_{3n} )\delta\mathbf{v} + ({\mathbf{I}}_m \otimes {\mathbf{E}}_a ){\mathbf{z}}_a + ({\mathbf{I}}_{m}\otimes {\mathbf{E}}_{b}){\mathbf{z}}_{b} + \mathbf{e} \equiv \\ &\qquad \qquad \qquad \qquad \qquad \qquad \equiv \mathbf{J}\delta{\mathbf{x}}_0 + {\mathbf{J}}_t \delta \mathbf{v} + {\mathbf{G}}_a {\mathbf{z}}_a + {\mathbf{G}}_b {\mathbf{z}}_b + \mathbf{e} \end{aligned} $$
(424)
In general, adjustments with a restricted model b = A1x1 + e (R) and an extended one b = A1x1 + A2x2 + e (E) lead to different estimates for estimable quantities. In the rank deficient case the models are equivalent when the observable estimates \(\hat {\mathbf {y}}_R = {\mathbf {A}}_1 \hat {\mathbf {x}}_{1(R)} \) and \(\hat {\mathbf {y}}_E = {\mathbf {A}}_1 \hat {\mathbf {x}}_{1(E)} + {\mathbf {A}}_2 \hat {\mathbf {x}}_{2(E)} \) are identical. This will happen only when the columns of A2 already belong to the span of the columns of A1, in mathematical terms R(A2) ⊆ R(A1). In this case R(A1) = R([A1A2]), the subspace M of the observables is the same and the orthogonal projection \(\hat {\mathbf {y}}\) of the observations b on M will be the same. In our case the question is whether all the columns of the matrix
$$\displaystyle \begin{aligned} \mathop{({\mathbf{I}}_m \otimes {\mathbf{E}}_b)}\limits_{(3nm)\times (3m)} = \left[ \begin{array}{ccccc} {{\mathbf{E}}_b } & \cdots & \mathbf{0} & \cdots & \mathbf{0} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ \mathbf{0} & \cdots & {{\mathbf{E}}_b } & \cdots & \vdots \\ \vdots & \ddots & \vdots & \ddots & \mathbf{0} \\ \mathbf{0} & \cdots & \mathbf{0} & \cdots & {{\mathbf{E}}_b } \end{array} \right], \end{aligned} $$
(425)
can be expressed as linear combinations of the columns of the matrix
$$\displaystyle \begin{aligned}{}[{\mathbf{1}}_m \otimes {\mathbf{I}}_{3n} \quad \boldsymbol{\uptau } \otimes {\mathbf{I}}_{3n} \quad {\mathbf{I}}_m \otimes {\mathbf{E}}_a ] = \left[ \begin{array}{ccccccc} {{\mathbf{I}}_{3n} } & {(t_1 - t_0 ){\mathbf{I}}_{3n} } & {{\mathbf{E}}_{a} } & \cdots & \mathbf{0} & \cdots & \mathbf{0} \\ \vdots & \vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\ {{\mathbf{I}}_{3n} } & {(t_k - t_0 ){\mathbf{I}}_{3n} } & \mathbf{0} & \cdots & {{\mathbf{E}}_{a} } & \cdots & \vdots \\ \vdots & \vdots & \vdots & \ddots & \vdots & \ddots & \mathbf{0} \\ {{\mathbf{I}}_{3n} } & {(t_m - t_0 ){\mathbf{I}}_{3n} } & \mathbf{0} & \cdots & \mathbf{0} & \cdots & {{\mathbf{E}}_{a} } \end{array} \right], \end{aligned} $$
(426)
where
$$\displaystyle \begin{aligned} \mathop{{\mathbf{E}}_b }\limits_{(3n)\times 3} = \left[ \begin{array}{c} {{\mathbf{I}}_3 } \\ \vdots \\ {{\mathbf{I}}_3 } \\ \vdots \\ {{\mathbf{I}}_3 } \end{array} \right],\quad \mathop{{\mathbf{E}}_a }\limits_{(3n)\times 4} = \left[ \begin{array}{cc} {[{\mathbf{x}}_{01}^{ap} \times ]} &\quad {{\mathbf{x}}_{01}^{ap} } \\ \vdots & \vdots \\ {[{\mathbf{x}}_{0i}^{ap} \times ]} &\quad {{\mathbf{x}}_{0i}^{ap} } \\ \vdots & \vdots \\ {[{\mathbf{x}}_{0n}^{ap} \times ]} &\quad {{\mathbf{x}}_{0n}^{ap} } \end{array} \right]. \end{aligned} $$
(427)
Since this does not appear to be possible, we come to the conclusion that the estimated shape of the network, as expressed by the station coordinates \(\hat {\mathbf {x}}_i (t) = \hat {\mathbf {x}}_{0i} + (t - t_0 )\hat {\mathbf {v}}_i \), i = 1, 2, …, n, is different in the two approaches, independently of the choice of the reference system.
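The span criterion can also be checked numerically: R(A2) ⊆ R(A1) holds if and only if rank([A1 A2]) = rank(A1). The following sketch builds toy versions of A1 = [J  Jt  Ga] and A2 = Gb via the Kronecker structure of (423) and (424) (random placeholder coordinates and arbitrary epochs) and, generically, confirms that the rank increases, i.e., that the two models indeed differ:

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 4, 5                                   # stations, epochs (toy sizes)
t = np.linspace(0.0, 4.0, m)                  # t_k - t_0

def cross_matrix(x):
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]])

X = rng.standard_normal((n, 3))               # placeholder approximate coordinates
E_a = np.vstack([np.hstack([cross_matrix(x), x.reshape(3, 1)]) for x in X])
E_b = np.vstack([np.eye(3)] * n)

J   = np.kron(np.ones((m, 1)), np.eye(3 * n))
J_t = np.kron(t.reshape(m, 1), np.eye(3 * n))
A1  = np.hstack([J, J_t, np.kron(np.eye(m), E_a)])
A2  = np.kron(np.eye(m), E_b)

r1  = np.linalg.matrix_rank(A1)
r12 = np.linalg.matrix_rank(np.hstack([A1, A2]))
print(r1, r12, "equivalent models" if r1 == r12 else "extended model differs")
```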

Let us first clarify what is the rigorous approach to use under the Gauss-Markov model, which is the basis of our computations. The data consist of per epoch station coordinate estimates x(tk), accompanied by weight matrices, which are none other than the coefficient matrices Nx(tk) of the normal equations formed from the actual observations performed within the time interval corresponding to the epoch tk, as already explained in Sect. 13.1, Proposition 4. The rank deficiencies of the Nx(tk) matrices correspond to the transformation characteristics (rotation, displacement, scaling) under which the primary observables either remain invariant (absolute deficiency) or vary very little (close to deficiency situation). In any case they can be numerically detected, identified and interpreted by the above mentioned deficiency indices. Only transformation parameters with respect to which Nx(tk) has a rank defect should be introduced in the stacking model. Thus only rotation parameters should be included for SLR, only rotation and displacement ones for VLBI, while all transformation parameters (rotation, translation and scale) should be included for GPS and DORIS. An additional requirement is that the per epoch estimates x(tk) have been correctly obtained by using minimal constraints.

Let us push the argument in favor of rigor one step further. The unknown parameters of interest, i.e., the initial station coordinates and their velocities, are correctly estimated if all observations from all epochs are jointly adjusted. In this case the joint normal equations will be the sum of the per epoch normal equations, with one difference: these will refer not to single epoch coordinates xi(tk), but to initial values x0i and velocities vi; that is, a linear-in-time model should be used within each epoch as well, instead of the model with constant coordinates actually used. An alternative to the addition of the normal equations is to use the per epoch estimates (of x0i and velocities vi) as pseudo-observations, with weight matrices their normal equations coefficient matrices. In both equivalent approaches (Sect. 13.1, Proposition 4) there is no room for transformation parameters, and the whole stacking idea in its usual form should be abandoned if one wants to be formally rigorous.
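The equivalence invoked here (addition of normal equations versus adjusting the per part estimates as pseudo-observations weighted by their normal equation matrices) is easily verified numerically; a minimal full-rank toy example with synthetic data:

```python
import numpy as np

rng = np.random.default_rng(4)
A1, A2 = rng.standard_normal((10, 3)), rng.standard_normal((12, 3))
b1, b2 = rng.standard_normal(10), rng.standard_normal(12)

N1, N2 = A1.T @ A1, A2.T @ A2
u1, u2 = A1.T @ b1, A2.T @ b2

x_joint = np.linalg.solve(N1 + N2, u1 + u2)                # added normal equations

x1, x2 = np.linalg.solve(N1, u1), np.linalg.solve(N2, u2)  # separate estimates
x_pseudo = np.linalg.solve(N1 + N2, N1 @ x1 + N2 @ x2)     # weighted pseudo-obs

print(np.abs(x_joint - x_pseudo).max())                    # ~1e-15: identical
```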

In practice, however, this rigorous approach is not followed, for a simple reason: it is based upon the basic statistical assumptions of the Gauss-Markov model and is correct only when these assumptions (zero mean errors, known error covariance matrix up to a scalar factor) are consistent with physical reality. This is certainly not the case here, because systematic errors do affect the performed observations.

In this respect, the role of the transformation parameters in the stacking solution is twofold: on one hand to free the coordinates xi(tk) of the independent solutions at epochs tk from their different reference systems, and on the other hand to absorb a part of the systematic errors, i.e., to remove an error trend that can be expressed by a combination of rotation, translation and scale. Thus there is an advantage in using the extended approach, allowing all seven transformation parameters, and then seeking a way to convert the reference system established through minimal constraints to a geocentric one, or to one that has a desired scale.

As explained by Chatzinikos and Dermanis [20], different least squares solutions to the stacking problem are related by \({\hat {\mathbf {x}}^{\prime }}_{0i} = \hat {\mathbf {x}}_{0i} + {\mathbf {E}}_i {\mathbf {p}}_0 \), \({\hat {\mathbf {v}}^{\prime }}_i = \hat {\mathbf {v}}_i + {\mathbf {E}}_i \dot {\mathbf {p}}\) and \({\hat {\mathbf {z}}^{\prime }}_k = \hat {\mathbf {z}}_k - {\mathbf {p}}_0 - (t_k - t_0 )\dot {\mathbf {p}}\) (see propositions 1 and 2 of chapter 7). For comparing the extended with the restricted approach we will separate the transformation parameters \(\hat {\mathbf {z}}_k \) in these relations into ones \(\hat {\mathbf {z}}_{a,k} \) that are present in both approaches and ones \(\hat {\mathbf {z}}_{b,k} \) that are present only in the extended approach. In this case two different least squares solutions, corresponding to a different choice of the spatiotemporal reference system, are related through
$$\displaystyle \begin{aligned} &{\hat{\mathbf{x}}^{\prime}}_{0i} = \hat{\mathbf{x}}_{0i} + {\mathbf{E}}_{a,i} {\mathbf{p}}_{a,0} + {\mathbf{E}}_{b,i} {\mathbf{p}}_{b,0} ,\qquad \qquad {\hat{\mathbf{v}}^{\prime}}_i = \hat{\mathbf{v}}_i + {\mathbf{E}}_{a,i} \dot{\mathbf{p}}_a + {\mathbf{E}}_{b,i} \dot{\mathbf{p}}_b , \\ &\hat{\mathbf{z}}_{a,k}^{\prime} = \hat{\mathbf{z}}_{a,k} - {\mathbf{p}}_{a,0} - (t_k - t_0 )\dot{\mathbf{p}}_a ,\qquad \qquad {\hat{\mathbf{z}}^{\prime}}_{b,k} = {\hat{\mathbf{z}}}_{b,k} - {\mathbf{p}}_{b,0} - (t_k - t_0 )\dot{\mathbf{p}}_b .{} \end{aligned} $$
(428)
In the restricted approach \(\hat {\mathbf {z}}_{b,k} = \mathbf {0}\) has been enforced, thus leading to pb,0 = 0 and \(\dot {\mathbf {p}}_b = \mathbf {0}\), with allowable transformations \({\hat {\mathbf {x}}^{\prime }}_{0i} = \hat {\mathbf {x}}_{0i} + {\mathbf {E}}_{a,i} {\mathbf {p}}_{a,0} \), \({\hat {\mathbf {v}}^{\prime }}_i = \hat {\mathbf {v}}_i + {\mathbf {E}}_{a,i} \dot {\mathbf {p}}_a \) between different least squares solutions, corresponding to the reference system characteristics that have not been inherited from the epoch solutions xi(tk). In the extended approach \(\hat {\mathbf {z}}_{b,k} \ne \mathbf {0}\), but the new solution \({\hat {\mathbf {x}}^{\prime }}_{0i} \), \({\hat {\mathbf {v}}^{\prime }}_i \) is selected to be the one closest to that with \(\hat {\mathbf {z}}_{b,k} = \mathbf {0}\), in the sense that the resulting new transformation parameters \({\hat {\mathbf {z}}^{\prime }}_{b,k} = \hat {\mathbf {z}}_{b,k} - {\mathbf {p}}_{b,0} - (t_k - t_0 )\dot {\mathbf {p}}_b \) are collectively as small as possible. Measuring their total magnitude by \(\sum \limits _{k = 1}^m {\left \|{{\hat {\mathbf {z}}^{\prime }}_{b,k} } \right \|{ }^2} \), we can minimize them collectively by choosing as transformation parameters pb,0, \(\dot {\mathbf {p}}_b \) the ones that make \(\sum \limits _{k = 1}^m {({\hat {\mathbf {z}}^{\prime }}_{b,k} )^T({\hat {\mathbf {z}}^{\prime }}_{b,k} ) = \mathop {\min } \limits _{{\mathbf {p}}_{b,0} ,\dot {\mathbf {p}}_b } } \). This is in fact a linear regression with model \(\hat {\mathbf {z}}_{b,k} = {\mathbf {p}}_{b,0} + (t_k - t_0 )\dot {\mathbf {p}}_b + {\hat {\mathbf {z}}^{\prime }}_{b,k} \), which can be easily solved to provide the optimal values \(\hat {\mathbf {p}}_{b,0} \), \(\hat {\dot {\mathbf {p}}}_b \). Then the desired solution, referring to a reference system with characteristics (origin, scale) close to the desired ones, is given by \({\hat {\mathbf {x}}^{\prime }}_{0i} = \hat {\mathbf {x}}_{0i} + {\mathbf {E}}_{b,i} \hat {\mathbf {p}}_{b,0} \), \({\hat {\mathbf {v}}^{\prime }}_i = \hat {\mathbf {v}}_i + {\mathbf {E}}_{b,i} \hat {\dot {\mathbf {p}}}_b \), leaving the rest of the transformation parameters unchanged (\({\hat {\mathbf {z}}^{\prime }}_{a,k} = \hat {\mathbf {z}}_{a,k} )\). Thus, the extended approach can be realized as a compromise that tries to find a balance between the need for systematic error removal and the need for closeness to a solution with desired characteristics (geocentric, having the scale of VLBI or SLR).
Coming to the specific problem of converting a solution obtained by the extended approach to one as close to geocentric as possible, we have the case where \(\hat {\mathbf {z}}_{b,k} = \hat {\mathbf {d}}_k \), pb,0 = d0 and \(\dot {\mathbf {p}}_b = \dot {\mathbf {d}}\). The relevant regression takes the form
$$\displaystyle \begin{aligned} \hat{\mathbf{d}}_k = {\mathbf{d}}_0 + (t_k - t_0 )\dot{\mathbf{d}} + {\hat{\mathbf{d}}^{\prime}}_k ,\quad \sum_{k = 1}^m {({\hat{\mathbf{d}}^{\prime}}_k )^T} {\hat{\mathbf{d}}^{\prime}}_k = \mathop{\min }\limits_{{\mathbf{d}}_0 ,\dot{\mathbf{d}}} , \end{aligned} $$
(429)
and has solution
$$\displaystyle \begin{aligned} &\hat{\mathbf{d}}_0 = \frac{1 }{\tau_2 - \tau_1^2 }\left[\tau_{2}\bar{\mathbf{d}}-\frac{\tau_{1}}{m}\sum\nolimits_k {(t_k - t_0 )\hat{\mathbf{d}}_k }\right] , \\ &\hat{\dot{\mathbf{d}}} = \frac{1}{\tau_2 - \tau_1^2 }\left[-\tau_{1}\bar{\mathbf{d}} + \frac{1}{m}\sum\nolimits_k {(t_k - t_0 )\hat{\mathbf{d}}_k }\right] ,{} \end{aligned} $$
(430)
where \(\bar {\mathbf {d}}\) is the mean of the values \(\hat{\mathbf{d}}_k\) and \(\tau _p = \frac {1}{m}\sum \nolimits _k {(t_k - t_0 )^{p}}\), p = 1, 2.
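The closed-form expressions (430) are just the ordinary least squares trend fit; a short numerical check against a generic line fit, with a synthetic translation series:

```python
import numpy as np

rng = np.random.default_rng(6)
m = 8
dt = np.linspace(0.0, 7.0, m)                         # t_k - t_0
d = 2.0 + 0.3 * dt + 0.01 * rng.standard_normal(m)    # one translation component

tau1, tau2 = dt.mean(), (dt**2).mean()
dbar = d.mean()
d0_hat   = (tau2 * dbar - tau1 * (dt * d).mean()) / (tau2 - tau1**2)   # Eq. (430)
ddot_hat = (-tau1 * dbar + (dt * d).mean()) / (tau2 - tau1**2)

slope, intercept = np.polyfit(dt, d, 1)               # independent reference fit
print(np.allclose([d0_hat, ddot_hat], [intercept, slope]))   # True
```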
The change of the reference system of the stacking solution to a “geocentric” one is realized via
$$\displaystyle \begin{aligned} {\hat{\mathbf{x}}^{\prime}}_{0i} = \hat{\mathbf{x}}_{0i} + \hat{\mathbf{d}}_0 ,\qquad {\hat{\mathbf{v}}^{\prime}}_i = \hat{\mathbf{v}}_i + \hat{\dot{\mathbf{d}}}. \end{aligned} $$
(431)
In the case of converting a solution of the extended approach to one that is close in scale to that of a specific technique (SLR or VLBI), we have \(\hat {\mathbf {z}}_{b,k} = \hat {s}_k \), pb,0 = s0 and \(\dot {\mathbf {p}}_b = \dot {s}\). The relevant regression takes the form
$$\displaystyle \begin{aligned} \hat{s}_k = s_0 + (t_k - t_0 )\dot{s} + \hat{s}_k^{\prime} ,\qquad \sum_{k = 1}^m {({\hat{s}}^{\prime}_k )^2 = \mathop{\min }\limits_{s_0 ,\dot{s}} } , \end{aligned} $$
(432)
which has solution
$$\displaystyle \begin{aligned} &\hat{s}_0 = \frac{1}{\tau_2 - \tau_1^2 }\left[ {\tau_2 \overline{s} - \tau_1 \frac{1}{m}\sum\nolimits_k {(t_k - t_0 )\hat{s}_k } } \right], \\ {} &\hat{\dot{s}} = \frac{1}{\tau_2 - \tau_1^2 }\left[ { - \tau_1 \overline{s} + \frac{1}{m}\sum\nolimits_k {(t_k - t_0 )\hat{s}_k } } \right], \end{aligned} $$
(433)
where \(\overline {s}\) is the mean of the values \(\hat {s}_k \), while τ1 and τ2 are the same as above.
The change of the reference system of the stacking solution to one with scale close to that of a particular technique is realized via
$$\displaystyle \begin{aligned} {\hat{\mathbf{x}}^{\prime}}_{0i} = \hat{\mathbf{x}}_{0i} + \hat{s}_0 {\mathbf{x}}_{0i}^{ap} ,\quad {\hat{\mathbf{v}}^{\prime}}_i = \hat{\mathbf{v}}_i + \hat{\dot{s}}{\mathbf{x}}_{0i}^{ap} . \end{aligned} $$
(434)
When all the techniques are combined to get joint estimates \(\hat {\mathbf {x}}_{0i}, \hat {\mathbf {v}}_i \) and \(\hat {\mathbf {z}}_{k,T}, T =\)V, S, G, D, an equally valid solution in a different reference system is given by
$$\displaystyle \begin{aligned} {\hat{\mathbf{x}}^{\prime}}_{0i} = \hat{\mathbf{x}}_{0i} + {\mathbf{E}}_i {\mathbf{p}}_0 ,\qquad \quad {\hat{\mathbf{v}}^{\prime}}_i = \hat{\mathbf{v}}_i + {\mathbf{E}}_i \dot{\mathbf{p}},\qquad {\hat{\mathbf{z}}^{\prime}}_{k,T} = \hat{\mathbf{z}}_{k,T} - {\mathbf{p}}_0 - (t_k - t_0 )\dot{\mathbf{p}}. \end{aligned} $$
(435)
A regression of the SLR displacement transformation parameters \(\hat{\mathbf{d}}_{k,S}\) within \(\hat {\mathbf {z}}_{k,S} \) will give the parameters of the linear trend \(\hat {\mathbf {d}}_{0,S} \), \(\hat {\dot {\mathbf {d}}}_S \), which lead to a geocentric solution, while a regression of the SLR scale transformation parameters \(\hat {s}_{k,S} \) within \(\hat {\mathbf {z}}_{k,S} \) will give the parameters of the linear trend \(\hat {s}_{0S} \), \(\hat {\dot {s}}_S \), which lead to a solution adapted to the SLR scale. Applying both we obtain the geocentric solution with SLR scale
$$\displaystyle \begin{aligned} \hat{\mathbf{x}}_{0i(G,S)} = \hat{\mathbf{x}}_{0i} + \hat{\mathbf{d}}_{0,S} + \hat{s}_{0S} {\mathbf{x}}_i^{ap} ,\qquad \hat{\mathbf{v}}_{i(G,S)} = \hat{\mathbf{v}}_i + \hat{\dot{\mathbf{d}}}_S + \hat{\dot{s}}_S {\mathbf{x}}_i^{ap} . \end{aligned} $$
(436)
Alternatively a regression of the VLBI scale transformation parameters \(\hat {s}_{k,V} \) within \(\hat {\mathbf {z}}_{k,V} \) will give the parameters of the linear trend \(\hat {s}_{0V} \), \(\hat {\dot {s}}_V \) which combined with \(\hat {\mathbf {d}}_{0,S} \), \(\hat {\dot {\mathbf {d}}}_S \) will lead to a geocentric solution adapted to the VLBI scale
$$\displaystyle \begin{aligned} \hat{\mathbf{x}}_{0i(G,V)} = \hat{\mathbf{x}}_{0i} + \hat{\mathbf{d}}_{0,S} + \hat{s}_{0V} {\mathbf{x}}_i^{ap} ,\qquad \hat{\mathbf{v}}_{i(G,V)} = \hat{\mathbf{v}}_i + \hat{\dot{\mathbf{d}}}_S + \hat{\dot{s}}_V {\mathbf{x}}_i^{ap} . \end{aligned} $$
(437)
For an in-between solution with scale inherited from both SLR and VLBI one may use, instead of \(\hat {s}_{0S} \), \(\hat {\dot {s}}_S \) or \(\hat {s}_{0V} \), \(\hat {\dot {s}}_V \), a weighted combination
$$\displaystyle \begin{aligned} \hat{s}_{0,S + V} = \lambda \hat{s}_{0V} + (1 - \lambda )\hat{s}_{0S} ,\qquad \hat{\dot{s}}_{S + V} = \lambda \hat{\dot{s}}_V + (1 - \lambda )\hat{\dot{s}}_S ,\qquad \quad 0 \le \lambda \le 1, \end{aligned} $$
(438)
which leads to a solution with combined scale
$$\displaystyle \begin{aligned} \hat{\mathbf{x}}_{0i(G,S + V)} = \hat{\mathbf{x}}_{0i} + \hat{\mathbf{d}}_{0,S} + \hat{s}_{0,S + V} {\mathbf{x}}_i^{ap} ,\qquad \qquad \hat{\mathbf{v}}_{i(G,S + V)} = \hat{\mathbf{v}}_i + \hat{\dot{\mathbf{d}}}_S + \hat{\dot{s}}_{S + V} {\mathbf{x}}_i^{ap} . \end{aligned} $$
(439)
For λ = 0 one obtains a solution adapted to the SLR scale only, while for λ = 1 the solution is adapted to the VLBI scale only. The question of whether to use the simple restricted approach or the above, slightly more complicated, extended approach cannot be answered by theoretical means alone. Additional information is needed on the nature of the effects of systematic errors on the estimated per-epoch solutions, the main question being whether they can be effectively reduced by removing a trend of the form of a similarity transformation.
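As a usage illustration (again continuing the hypothetical sketches above, with the SLR trends relabeled per technique and a made-up VLBI trend), the weighted combination (438)–(439) is a one-liner:

```python
# continuation: SLR and VLBI scale trends assumed available from regressions
s0_S, sdot_S = s0, sdot                      # stand-ins from the SLR regression
s0_V, sdot_V = 1.2e-9, -0.3e-9               # hypothetical VLBI scale trend

lam = 0.5                                    # lam = 0: SLR scale, lam = 1: VLBI
s0_SV = lam * s0_V + (1 - lam) * s0_S        # eq. (438)
sdot_SV = lam * sdot_V + (1 - lam) * sdot_S

x0_GSV = x0 + d0 + s0_SV * x_ap              # eq. (439): geocentric, mixed scale
v_GSV = v + ddot + sdot_SV * x_ap
```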

The two-step approach described in Sect. 17, which is applicable only in the case of (incorrectly) non-singular per-technique normal equation coefficient matrices, has the advantage that the problems of scale and origin can be attacked in a much simpler way in the second, combination step. Recall that the station initial coordinate and velocity estimates \(\hat {\mathbf {x}}_{0i,T\vert T} \), \(\hat {\mathbf {v}}_{i,T\vert T} \), obtained from the stacking of each technique T separately, are the data in a combination with observation equations model \(\hat {\mathbf {x}}_{0i,T\vert T} = {\mathbf {x}}_{0i} + {\mathbf {E}}_i {\mathbf {p}}_{0T} + {\mathbf {e}}_{{\mathbf {x}}_{0i,T} } \) and \(\hat {\mathbf {v}}_{i,T\vert T} = {\mathbf {v}}_i + {\mathbf {E}}_i \dot {\mathbf {p}}_T + {\mathbf {e}}_{{\mathbf {v}}_{iT} } \) (Eq. 394), where the unknowns are the final ITRF parameters x0i, vi, plus the nuisance transformation parameters p0T, \(\dot {\mathbf {p}}_T \) (initial epoch values and rates, respectively) from the ITRF reference system to that of each particular technique T.

The separate estimates \(\hat {\mathbf {x}}_{0i,T\vert T} \), \(\hat {\mathbf {v}}_{i,T\vert T} \) can be obtained using either the restricted or the extended model approach for SLR and/or VLBI. While all seven transformation parameters zk are included for GPS and DORIS, in the restricted model approach zk contains only orientation parameters for SLR, where both origin (geocenter) and scale are defined, and only orientation and translation parameters for VLBI, where scale is defined. In the combination step the restricted model approach must be used, with the same sets of parameters included within p0T and \(\dot {\mathbf {p}}_T \) for each technique as in the restricted model approach within each technique. This is fully justified when the extended model approach has been used in the separate stackings per technique, because in that case the larger part of the systematic errors has been absorbed by the per-epoch transformation parameters. When scale transformation parameters s0T (within p0T) and \(\dot {s}_T\) (within \(\dot {\mathbf {p}}_T\)) are excluded for both VLBI and SLR, the resulting ITRF scale is a combination of the VLBI and SLR scales. If s0T and \(\dot {s}_T \) are excluded for only one of these two techniques, then its scale passes to the ITRF. It is thus possible to obtain two ITRFs, one with VLBI scale and one with SLR scale, which are related by a scale transformation only and can be combined into an in-between solution with an intermediate weighted-average scale. For example, if \(\hat {\mathbf {x}}_{0i\vert V} = \hat {\mathbf {x}}_{0i\vert S} + s_0 {\mathbf {x}}_{0i}^{ap} \), \(\hat {\mathbf {v}}_{i\vert V} = \hat {\mathbf {v}}_{i\vert S} + \dot {s}{\mathbf {x}}_{0i}^{ap} \) is the relation between the solution based on SLR scale only (\(\hat {\mathbf {x}}_{0i\vert S} \), \(\hat {\mathbf {v}}_{i\vert S} )\) and the solution based on VLBI scale only (\(\hat {\mathbf {x}}_{0i\vert V} \), \(\hat {\mathbf {v}}_{i\vert V} )\), an in-between solution will be
$$\displaystyle \begin{aligned} \hat{\mathbf{x}}_{0i} {=} \hat{\mathbf{x}}_{0i\vert S} + \lambda s_0 {\mathbf{x}}_{0i}^{ap} = \hat{\mathbf{x}}_{0i\vert V} + (\lambda - 1)s_0 {\mathbf{x}}_{0i}^{ap} ,\quad \hat{\mathbf{v}}_i {=} \hat{\mathbf{v}}_{i\vert S} + \lambda \dot{s}{\mathbf{x}}_{0i}^{ap} = \hat{\mathbf{v}}_{i\vert V} + (\lambda - 1)\dot{s}{\mathbf{x}}_{0i}^{ap} , \end{aligned} $$
(440)
with 0 ≤ λ ≤ 1: λ = 0 reproduces the solution with SLR scale only, while λ = 1 gives the one with VLBI scale only.
A particular approach, which enables control of the relative “flow” of scale from SLR and VLBI to the ITRF solution, is to model the scale parameters s0T and \(\dot {s}_T \) for SLR and VLBI as stochastic parameters with zero mean and known variances. Splitting
$$\displaystyle \begin{aligned} {\mathbf{E}}_i = [{\mathbf{E}}_{ai}\ {\mathbf{E}}_{bi} ],\ {\mathbf{p}}_{0T} = \left[ \begin{array}{c} {{\mathbf{p}}_{a,0T} } \\ {p_{b,0T} } \end{array} \right] = \left[ \begin{array}{c} {{\mathbf{p}}_{a,0T} } \\ {s_{0T} } \end{array} \right],\ \dot{\mathbf{p}}_T = \left[ \begin{array}{c} {\dot{\mathbf{p}}_{a,T} } \\ {\dot{p}_{b,T} } \end{array} \right] = \left[ \begin{array}{c} {\dot{\mathbf{p}}_{a,T} } \\ {\dot{s}_T } \end{array} \right], \end{aligned} $$
(441)
where pa,0T and \(\dot {\mathbf {p}}_{a,T} \) contain the orientation (\(\boldsymbol {\theta }_{0T} \), \(\dot {\boldsymbol {\theta }}_T \)) and the translation (d0T, \(\dot {\mathbf {d}}_T )\) parameters, the observation equations (394) take the form
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} {\hat{\mathbf{x}}_{0i,T\vert T} } \\ {\hat{\mathbf{v}}_{i,T\vert T} } \end{array} \right] = \left[ \begin{array}{cccc} {{\mathbf{I}}_3 } & \mathbf{0} & {{\mathbf{E}}_{a,i} } & \mathbf{0} \\ \mathbf{0} & {{\mathbf{I}}_3 } & \mathbf{0} & {{\mathbf{E}}_{a,i} } \end{array} \right]\left[ \begin{array}{c} {{\mathbf{x}}_{0i} } \\ {{\mathbf{v}}_i } \\ {{\mathbf{p}}_{a,0T} } \\ {\dot{\mathbf{p}}_{a,T} } \end{array} \right] + \left[ \begin{array}{cc} {{\mathbf{E}}_{b,i} } & \mathbf{0} \\ \mathbf{0} & {{\mathbf{E}}_{b,i} } \end{array} \right]\left[ \begin{array}{c} {s_{0T} } \\ {\dot{s}_T } \end{array} \right] + \left[ \begin{array}{c} {{\mathbf{e}}_{{\mathbf{x}}_{0i,T} } } \\ {{\mathbf{e}}_{{\mathbf{v}}_{i,T} } } \end{array} \right]. \end{aligned} $$
(442)
The final observation equations for all stations (395) take the form
$$\displaystyle \begin{aligned} \left[ \begin{array}{c} {\hat{\mathbf{x}}_{0,T\vert T} } \\ {\hat{\mathbf{v}}_{T\vert T} } \end{array} \right] & = \left[ \begin{array}{cccc} {{\mathbf{I}}_{3n} } & \mathbf{0} & {{\mathbf{E}}_a } & \mathbf{0} \\ \mathbf{0} & {{\mathbf{I}}_{3n} } & \mathbf{0} & {{\mathbf{E}}_a } \end{array} \right]\left[ \begin{array}{c} {{\mathbf{x}}_0 } \\ {\mathbf{v} } \\ {{\mathbf{p}}_{a,0T} } \\ {\dot{\mathbf{p}}_{a,T} } \end{array} \right] + \left[ \begin{array}{cc} {{\mathbf{E}}_b } & \mathbf{0} \\ \mathbf{0} & {{\mathbf{E}}_b } \end{array} \right]\left[ \begin{array}{c} {s_{0T} } \\ {\dot{s}_T } \end{array} \right] + \left[ \begin{array}{c} {{\mathbf{e}}_{{\mathbf{x}}_{0T}} } \\ {{\mathbf{e}}_{{\mathbf{v}}_{T}} } \end{array} \right],\\ T& = V,S,G,D, \end{aligned} $$
(443)
which is a mixed linear model, with both deterministic and stochastic parameters, that can be adjusted by the standard relevant techniques. The variances \(\sigma _{s_{0T} }^2 \), \(\sigma _{\dot {s}_T }^2 \) for VLBI and SLR can be used to control the relative influence of the VLBI and SLR scales on the final ITRF scale: the smaller the variance, the greater the influence of the corresponding technique on the ITRF scale.
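A minimal sketch of the mechanism (assumptions: `A` is a design matrix of the form (443) whose last two columns multiply s0T and ṡT, `y` the stacked input estimates, `P` their weight matrix): treating the two scale parameters as zero-mean stochastic parameters with given variances amounts to augmenting the corresponding diagonal entries of the normal equations, exactly as weighted zero-valued pseudo-observations would.

```python
import numpy as np

def solve_with_stochastic_scale(A, y, P, sig_s0, sig_sdot):
    """Mixed-model adjustment of (443): the last two unknowns (s0_T, sdot_T)
    are zero-mean random parameters with standard deviations sig_s0, sig_sdot.
    Smaller variances pull them toward zero, so more of the corresponding
    technique's scale flows into the combined (ITRF) solution."""
    N = A.T @ P @ A
    u = A.T @ P @ y
    N[-2, -2] += 1.0 / sig_s0 ** 2      # pseudo-observation s0_T = 0
    N[-1, -1] += 1.0 / sig_sdot ** 2    # pseudo-observation sdot_T = 0
    return np.linalg.solve(N, u)
```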

19 Post-linear Coordinate Variation Models

Examination of the residuals Δxi(tk) = xi(tk) −x0i − (tk − t0)vi of coordinate time series, after a linear trend x0i + (tk − t0)vi has been removed, reveals a systematic behavior and calls for additional modeling of the non-linear temporal variation. There are three requirements that such a model should fulfill [31]:
  (a) analytical description in terms of a finite number of parameters, so that coordinates at any particular time instant can be computed;

  (b) adaptability to the main characteristics of the coordinate time series variation, so that only a high frequency part that can be attributed to observational noise is left out;

  (c) description in terms of parameters that are appropriate for a physical interpretation.
Examination of the residuals Δxi(tk) demonstrates a quasi-periodic annual variation. The corresponding spectra, obtained by a discrete Fourier transform, reveal peaks close to the annual and semi-annual frequencies, although these peaks do not stand out clearly and contributions from other frequencies are present. In any case, the most obvious choice is that of a Fourier series model
$$\displaystyle \begin{aligned} x(t) = \sum_j {\left[ {a_j \cos \frac{2\pi t}{T_j } + b_j \sin \frac{2\pi t}{T_j }} \right]},\end{aligned} $$
(444)
with frequencies fj = 1∕Tj that include at least an annual (Tj = 1 year) and a semi-annual (Tj = 6 months) term. In fact, the latest ITRF version provides the coefficients aj, bj of annual and semi-annual terms, although these are not included in the official ITRF2014 solution [7]. In addition, it includes models for station coordinate variations associated with post-seismic deformation.
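For illustration, a least squares fit of the annual plus semi-annual model (444) can be sketched as follows; the epochs and residual series below are synthetic stand-ins, not real station data.

```python
import numpy as np

def fit_fourier(t, x, periods=(1.0, 0.5)):
    """Least squares fit of eq. (444) with periods T_j in years;
    returns the (a_j, b_j) coefficients and the fitted signal."""
    cols = []
    for T in periods:
        w = 2.0 * np.pi / T
        cols += [np.cos(w * t), np.sin(w * t)]
    A = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(A, x, rcond=None)
    return coef, A @ coef

t = np.arange(0.0, 5.0, 1.0 / 52.0)          # weekly epochs over 5 years
x = (0.003 * np.cos(2 * np.pi * t)           # synthetic annual signal (m)
     + 0.001 * np.sin(4 * np.pi * t)         # synthetic semi-annual signal
     + 0.0005 * np.random.randn(t.size))     # observational noise
coef, fitted = fit_fourier(t, x)             # a1, b1 (annual), a2, b2 (semi-annual)
```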

Removal of such periodic terms leaves behind a systematic part that varies from station to station and is more intense in the height component. Thus the above model largely fulfills the requirements of analytical description and physical interpretation, but falls somewhat short as far as adaptability is concerned. It is obvious that some additional non-periodic signal is present in the coordinate time series.

Another popular approach is Singular Spectrum Analysis (SSA) for a discrete time series (see e.g., [36, 37] for a general exposition and [22, 53] for geodetic applications). The original time series {xk} is replaced with a smoothed time series \(\{{x}^{\prime }_k \}\), so that the method is essentially a filtering method. It demonstrates excellent adaptability but falls short with respect to the other two requirements, analytical description and physical interpretation. Our own rather extensive computational experience has shown that, despite its profound theoretical foundation, results indistinguishable from those of SSA can also be obtained by a simple moving average (\({x}^{\prime }_k = \frac {1}{2M + 1}\sum \nolimits _{j = k - M}^{k + M} {x_j } )\) with appropriate window length 2M + 1. Two more methods, which provide both adaptability and analytical description, have been proposed by Chatzinikos and Dermanis [19]. The first is the use of cubic splines with equidistant nodes, least-squares fitted to the coordinate time series. The interpolating function is defined in a piecewise manner by cubic polynomials Si(t) which share common values, and common first and second derivatives, at the bordering points (nodes). There are two equivalent models
$$\displaystyle \begin{aligned} x(t) {=} S_i (t) &= g_{i - 1} {+} N_{i - 1} (t - \tau_{i - 1} ) {+} \left[ {3\frac{g_i - g_{i - 1} }{h^2} - \frac{2N_{i - 1} + N_i }{h}} \right](t - \tau_{i - 1} )^2 + \\ &\quad + \left[ { - 2\frac{g_i - g_{i - 1} }{h^3} + \frac{N_{i - 1} + N_i }{h^2}} \right](t - \tau_{i - 1} )^3,\qquad i = 1,2,\ldots ,n, \end{aligned} $$
(445)
and
$$\displaystyle \begin{aligned} x(t) &= S_i (t) = g_{i - 1} + \left[ {\frac{g_i - g_{i - 1} }{h} - \frac{2M_{i - 1} + M_i }{6}h} \right](t - \tau_{i - 1} ) + \frac{M_{i - 1} }{2}(t - \tau_{i - 1} )^2+\\ &\quad + \frac{M_i - M_{i - 1} }{6\,h}(t - \tau_{i - 1} )^3, \qquad i = 1,2,\ldots ,n, \end{aligned} $$
(446)
where τi, i = 0, 1, …, n, are the spline nodes, each spline Si(t) applies only to the interval τi−1 ≤ t ≤ τi, and h = τi+1 − τi is the distance between the nodes,
$$\displaystyle \begin{aligned} g_i & = S_i (\tau_i ) = S_{i + 1} (\tau_i ),\qquad \quad N_i = \frac{dS_i }{dt}(\tau_i ) = \frac{dS_{i + 1} }{dt}(\tau_i ),\\ M_i &= \frac{d^2S_i }{dt^2}(\tau_i ) = \frac{d^2S_{i + 1} }{dt^2}(\tau_i ). \end{aligned} $$
(447)
Model (445) is accompanied by the conditions of equality of the second derivatives
$$\displaystyle \begin{aligned} hN_{i - 1} + 4hN_i + hN_{i + 1} = 3(g_{i + 1}-g_{i - 1} ),\qquad i = 1,2,\ldots ,n - 1, \end{aligned} $$
(448)
while model (446) is accompanied by the conditions of equality of the first derivatives
$$\displaystyle \begin{aligned} M_{i - 1} + 4M_i + M_{i + 1} = \frac{6}{h^2}(g_{i - 1} - 2g_i + g_{i + 1} ),\qquad i = 1,2,\ldots ,n - 1. \end{aligned} $$
(449)
The unknown parameters gi, Ni, i = 0, 1, …, n, or gi, Mi, depending on the model used, are estimated by a least squares fit to the observed values xk = x(tk), using the standard method of least squares adjustment with linear constraints. The method demonstrates excellent adaptability, with results similar to those of SSA. It also has the advantage of analytical description, but the derived coefficients are not suited to a physical interpretation.
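A compact way to reproduce such a fit with equidistant interior nodes is scipy's least squares spline, which fits the same C²-continuous piecewise cubic as the explicit models (445)/(446) with the continuity conditions built in; the data below are synthetic stand-ins.

```python
import numpy as np
from scipy.interpolate import LSQUnivariateSpline

t = np.arange(0.0, 5.0, 1.0 / 52.0)             # epochs in years (synthetic)
x = 0.003 * np.sin(2 * np.pi * t) + 0.0005 * np.random.randn(t.size)

h = 0.5                                         # node distance (free parameter)
nodes = np.arange(t[0] + h, t[-1], h)           # interior nodes tau_i
spline = LSQUnivariateSpline(t, x, nodes, k=3)  # least squares C2 cubic fit
x_fit = spline(t)                               # adapted, analytical model
```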
For this reason, another method has been developed that fulfills all three requirements. It is based on the observation that the departure of the annual coordinate variation from perfect periodicity is due to its association with hydrological factors and, ultimately, with weather patterns. Although the four seasons repeat annually, winters and summers do not have the same intensity every year (some summers are hotter, some winters colder) and the associated weather phenomena do not arrive exactly on specific dates (some years winter comes earlier, some years later, and the same holds for summer). This leads us to take the basic annual periodic signal \(a_0 \cos (\omega _0 t) + b_0 \sin (\omega _0 t)\), with ω0 = 2π∕T0, T0 = 1 year, rewrite it in its polar form \(A_0 \cos (\omega _0 t - \varphi _0 )\), with amplitude \(A_0 = \sqrt {a_0^2 + b_0^2 } \) and phase \(\varphi _0 = \arctan (b_0 / a_0 )\), and modify it in a way that conforms with physical reality. Indeed, this monochromatic signal will serve as the carrier on which both a time-dependent amplitude and a time-dependent phase will be modulated, to obtain the representation
$$\displaystyle \begin{aligned} x(t) = A(t)\cos (\omega_0 t - \varphi (t)).\end{aligned} $$
(450)
The unknown amplitude and phase functions can be modeled in a simple way. We choose a piecewise linear representation
$$\displaystyle \begin{aligned} A(t) & = A_{i - 1} + \frac{A_i - A_{i - 1} }{\tau_i - \tau_{i - 1} }(t - \tau_{i - 1} ),\qquad \varphi (t) = \varphi_{i - 1} + \frac{\varphi_i - \varphi_{i - 1} }{\tau_i - \tau_{i - 1} }(t - \tau_{i - 1} ),\\ &\quad \tau_{i - 1} \le t \le \tau_i , \end{aligned} $$
(451)
where τi, i = 0, 1, …, n, are the piece nodes and Ai = A(τi), φi = φ(τi). A simple least squares fit to the observed values xk = x(tk), for the determination of the parameter values Ai, φi, has proven to suffer from over-adaptability, even for large distances between successive nodes. The interpolating function tends to absorb also the high frequency variations due to observational errors. For this reason the plain least squares approach (b = Ax + e, \({\mathbf {e}}^T\mathbf {Pe} = \min \)) has been replaced by a Tikhonov regularization (\({\mathbf {e}}^T\mathbf {Pe} + {\mathbf {x}}^T\mathbf {Wx} = \min \)) with
$$\displaystyle \begin{aligned} {\mathbf{x}}^T\mathbf{Wx} = \rho_A \sum_{i = 0}^n {(A_i - A_i^0 )^2} + \rho_\varphi \sum_{i = 0}^n {(\varphi_i - \varphi_i^0 )^2,} \end{aligned} $$
(452)
where ρA, ρφ are regularization parameters. The reference values \(A_i^0 \), \(\varphi _i^0 \) are obtained as follows: for every observation epoch tk, a periodic signal \(A_k \cos (\omega _0 t - \varphi _k )\) with annual period is best fitted to all the data within a year-long moving window centered at tk. The resulting dense data Ak, φk are averaged, or best fitted by piecewise linear functions, to obtain the required reference values \(A_i^0 \), \(\varphi _i^0 \) at the nodes. The regularization parameters ρA, ρφ and the node distance h = τi − τi−1 can be used as free parameters in order to tune the adaptability of the method. Under particular choices one obtains results practically identical to those of SSA (or the simple moving average filter) and of the approach with equidistant cubic splines. There is a great difference though: the parameters Ai, φi (or the linearly interpolated values A(t), φ(t)) are most appropriate for comparison with hydrological-meteorological data. Therefore, unlike the other approaches, the representation with amplitude and phase modulation is the only one that fulfills all three requirements: analytical description, adaptability and physical interpretation.
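The following sketch illustrates a regularized amplitude/phase fit on synthetic data. To keep the adjustment linear it uses the standard reparametrization a(t) = A(t)cos φ(t), b(t) = A(t)sin φ(t), so that x = a(t)cos ω0t + b(t)sin ω0t with piecewise linear nodal values for a and b, and a single Tikhonov weight ρ pulling the nodal values toward reference values (here zeros as stand-ins for \(A_i^0\), \(\varphi_i^0\)). Both simplifications are conveniences of this sketch, not necessarily the authors' implementation.

```python
import numpy as np

def hat_basis(t, nodes):
    """Piecewise linear (hat function) design matrix: row k holds the
    interpolation weights of the nodes at epoch t_k."""
    B = np.zeros((t.size, nodes.size))
    idx = np.clip(np.searchsorted(nodes, t) - 1, 0, nodes.size - 2)
    w = (t - nodes[idx]) / (nodes[idx + 1] - nodes[idx])
    B[np.arange(t.size), idx] = 1.0 - w
    B[np.arange(t.size), idx + 1] = w
    return B

w0 = 2.0 * np.pi                                 # annual carrier, t in years
t = np.arange(0.0, 5.0, 1.0 / 52.0)
x = (0.003 * (1 + 0.2 * np.sin(0.4 * t))         # synthetic modulated amplitude
     * np.cos(w0 * t - 0.3)
     + 0.0005 * np.random.randn(t.size))         # observational noise

nodes = np.arange(0.0, 5.01, 0.5)                # piece nodes tau_i
B = hat_basis(t, nodes)
A = np.hstack([B * np.cos(w0 * t)[:, None], B * np.sin(w0 * t)[:, None]])

rho = 1e-2                                       # regularization weight (tuning)
ref = np.zeros(2 * nodes.size)                   # stand-in reference values
N = A.T @ A + rho * np.eye(A.shape[1])           # Tikhonov-augmented normals
coef = np.linalg.solve(N, A.T @ x + rho * ref)

a_i, b_i = np.split(coef, 2)
A_i = np.hypot(a_i, b_i)                         # nodal amplitudes A(tau_i)
phi_i = np.arctan2(b_i, a_i)                     # nodal phases  phi(tau_i)
```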

References

  1. Altamimi, Z., Dermanis, A.: The choice of reference system in ITRF formulation. In: Sneeuw, N., et al. (eds.) VII Hotine-Marussi Symposium on Mathematical Geodesy. International Association of Geodesy Symposia, vol. 137, pp. 329–334. Springer, Berlin (2009)
  2. Altamimi, Z., Dermanis, A.: Theoretical foundations of ITRF determination. The algebraic and the kinematic approach. In: Katsampalos, K.V., Rossikopoulos, D., Spatalas, S., Tokmakidis, K. (eds.) On Measurements of Lands and Constructions. Volume in honor of Prof. Dimitrios G. Vlachos, pp. 331–359. Publication of the School of Rural & Surveying Engineering, Aristotle University of Thessaloniki (2013)
  3. Altamimi, Z., Sillard, P., Boucher, C.: ITRF2000: a new release of the international terrestrial reference frame for earth science applications. J. Geophys. Res. 107(B10), 2214 (2002)
  4. Altamimi, Z., Sillard, P., Boucher, C.: ITRF2000: from theory to implementation. In: Sansò, F. (ed.) V Hotine-Marussi Symposium on Mathematical Geodesy. IAG Symposia, vol. 127, pp. 157–163. Springer, Berlin (2004)
  5. Altamimi, Z., Collilieux, X., Legrand, J., Garayt, B., Boucher, C.: ITRF2005: a new release of the international terrestrial reference frame based on time series of station positions and earth orientation parameters. J. Geophys. Res. 112, B09401 (2007)
  6. Altamimi, Z., Collilieux, X., Métivier, L.: ITRF2008: an improved solution of the international terrestrial reference frame. J. Geod. 85, 457–473 (2011)
  7. Altamimi, Z., Rebischung, P., Métivier, L., Collilieux, X.: ITRF2014: a new release of the international terrestrial reference frame modeling nonlinear station motions. J. Geophys. Res. Solid Earth 121, 6109–6131 (2016)
  8. Angermann, D., Drewes, H., Krügel, M., Meisel, B., Gerstl, M., Kelm, R., Müller, H., Seemüller, W., Tesmer, V.: ITRS Combination Center at DGFI: A Terrestrial Reference Frame Realization 2003. Deutsche Geodätische Kommission, Reihe B, Nr. 313, München (2004)
  9. Angermann, D., Drewes, H., Gerstl, M., Krügel, M., Meisel, B.: DGFI combination methodology for ITRF2005 computation. In: Drewes, H. (ed.) Geodetic Reference Frames. IAG Symposia, vol. 134, pp. 11–16. Springer, Berlin (2009)
  10. Artz, T., Bernhard, L., Nothnagel, A., Steigenberger, P., Tesmer, S.: Methodology for the combination of sub-daily Earth rotation from GPS and VLBI observations. J. Geod. 86, 221–239 (2012)
  11. Baarda, W.: S-Transformations and Criterion Matrices. Netherlands Geodetic Commission, Publications on Geodesy, New Series, vol. 5, no. 1, Delft (1973). https://www.ncgeo.nl/downloads/18Baarda.pdf
  12. Baarda, W.: Linking up spatial models in geodesy. Extended S-Transformations. Netherlands Geodetic Commission, Publications on Geodesy, New Series, no. 41, Delft (1995). https://www.ncgeo.nl/downloads/41Baarda.pdf
  13. Biagi, L., Sansò, F.: Sistemi di riferimento in geodesia: algebra e geometria dei minimi quadrati per un modello con deficienza di rango [Reference systems in geodesy: algebra and geometry of least squares for a rank-deficient model]. Bollettino di Geodesia e Scienze Affini. Parte prima: Anno LXII, N. 4, 261–284. Parte seconda: Anno LXIII, N. 1, 29–52. Parte terza: Anno LXIII, N. 2, 129–149 (2003)
  14. Bjerhammar, A.: Rectangular reciprocal matrices with special emphasis to geodetic calculations. Bulletin Géodésique 52, 188–220 (1951)
  15. Blaha, G.: Inner adjustment constraints with emphasis on range observations. Department of Geodetic Science, Report 148, The Ohio State University, Columbus (1971)
  16. Blaha, G.: Free networks: minimum norm solution as obtained by the inner adjustment constraint method. Bulletin Géodésique 56, 209–219 (1982)
  17. Bolotin, S., Bizouard, C., Loyer, S., Capitaine, N.: High frequency variations of the earth's instantaneous angular velocity vector. Determination by VLBI data analysis. Astron. Astrophys. 317, 601–609 (1997)
  18. Capitaine, N., Guinot, B., Souchay, J.: A non-rotating origin of the instantaneous equator: definition, properties and use. Cel. Mech. 39, 283–307 (1986)
  19. Chatzinikos, M., Dermanis, A.: A comparison of existing and new methods for the analysis of nonlinear variations in coordinate time series. Presented at the IUGG General Assembly, Prague, 22 June–3 July 2015. Available at https://www.researchgate.net
  20. Chatzinikos, M., Dermanis, A.: A coordinate-invariant model for deforming geodetic networks: understanding rank deficiencies, non-estimability of parameters, and the effect of the choice of minimal constraints. J. Geod. 91, 375–396 (2017)
  21. Chatzinikos, M., Dermanis, A.: Interpretation of numerically detected rank defects in GNSS data analysis problems in terms of deficiencies in reference system definition. GPS Solutions 21, 1239–1250 (2017)
  22. Chen, Q., van Dam, T., Sneeuw, N., Collilieux, X., Weigelt, M., Rebischung, P.: Singular spectrum analysis for modeling seasonal signals from GPS time series. J. Geodyn. 72, 25–35 (2013)
  23. Dermanis, A.: The Non-Linear and the Space-Time Datum problem. Paper presented at the meeting "Mathematische Methoden der Geodäsie", Mathematisches Forschungsinstitut Oberwolfach, 1–7 Oct 1995. Available at http://der.topo.auth.gr and https://www.researchgate.net/
  24. Dermanis, A.: Generalized inverses of nonlinear mappings and the nonlinear geodetic datum problem. J. Geod. 72(2), 71–100 (1998)
  25. Dermanis, A.: Establishing global reference frames. Nonlinear, temporal, geophysical and stochastic aspects. Invited paper presented at the IAG international symposium, Banff, Alberta, 31 July–4 Aug 2000. In: Sideris, M.G. (ed.) Gravity, Geoid and Geodynamics. IAG Symposia, vol. 123, pp. 35–42. Springer, Berlin (2002)
  26. Dermanis, A.: Global reference frames: connecting observation to theory and geodesy to geophysics. In: IAG 2001 Scientific Assembly "Vistas for Geodesy in the New Millennium", Budapest, 2–8 Sept 2001. Available at http://der.topo.auth.gr and https://www.researchgate.net/
  27. Dermanis, A.: Some remarks on the description of earth rotation according to the IAU 2000 resolutions. In: From Stars to Earth and Culture. In honor of the memory of Professor Alexandros Tsioumis, pp. 280–291. School of Rural & Surveying Engineering, Aristotle University of Thessaloniki (2003)
  28. Dermanis, A.: The rank deficiency in estimation theory and the definition of reference frames. In: Sansò, F. (ed.) V Hotine-Marussi Symposium on Mathematical Geodesy, Matera, 17–21 June 2003. International Association of Geodesy Symposia, vol. 127, pp. 145–156. Springer, Heidelberg (2003)
  29. Dermanis, A.: Coordinates and Reference Systems. Ziti Publications, Thessaloniki (2005)
  30. Dermanis, A.: Compatibility of the IERS earth rotation representation and its relation to the NRO conditions. In: Proceedings, Journées 2005 Systèmes de Référence Spatio-Temporels "Earth dynamics and reference systems: five years after the adoption of the IAU 2000 Resolutions", Warsaw, 19–21 Sept 2005, pp. 109–112 (2005)
  31. Dermanis, A.: The ITRF beyond the "linear" model. Choices and challenges. In: Xu, P., Liu, J., Dermanis, A. (eds.) VI Hotine-Marussi Symposium on Theoretical and Computational Geodesy. International Association of Geodesy Symposia, vol. 132, pp. 111–118. Springer (2006). Invited presentation at the VI Hotine-Marussi Symposium, Wuhan, 29 May–2 June 2006
  32. Dermanis, A.: On the alternative approaches to ITRF formulation. A theoretical comparison. Presented at the IUGG General Assembly, Melbourne. In: Rizos, C., Willis, P. (eds.) Earth on the Edge: Science for a Sustainable Planet. International Association of Geodesy Symposia, vol. 139, pp. 223–229. Springer, Berlin/Heidelberg (2014)
  33. Dermanis, A.: Global reference systems: theory and open questions. Invited paper at the Accademia dei Lincei Session, VIII Hotine-Marussi Symposium on Mathematical Geodesy, Rome, 17–21 June 2013. In: Sneeuw, N., Novák, P., Crespi, M., Sansò, F. (eds.) VIII Hotine-Marussi Symposium on Mathematical Geodesy. IAG Symposia, vol. 142, pp. 9–16. Springer International Publishing, Switzerland (2016)
  34. Dermanis, A., Sansò, F.: Different equivalent approaches to the geodetic reference system. Rendiconti della Accademia dei Lincei, Scienze Fisiche e Naturali, Online First (volume in print) (2018)
  35. Dow, J., Neilan, R.E., Rizos, C.: The international GNSS service in a changing landscape of global navigation satellite systems. J. Geod. 83(3–4), 191–198 (2009). https://doi.org/10.1007/s00190-008-0300-3
  36. Elsner, J.B., Tsonis, A.A.: Singular Spectrum Analysis. A New Tool in Time Series Analysis. Plenum Press, New York (1996)
  37. Golyandina, N., Zhigljavsky, A.: Singular Spectrum Analysis for Time Series. Springer Briefs in Statistics. Springer, Berlin (2013). ISBN 978-3-642-34912-6
  38. Grafarend, E., Schaffrin, B.: Unbiased free net adjustment. Surv. Rev. 22(171), 200–218 (1974)
  39. Grafarend, E., Schaffrin, B.: Equivalence of estimable quantities and invariants in geodetic networks. Zeitschrift für Vermessungswesen 101(11), 485–491 (1976)
  40. Gross, J.: The general Gauss-Markov model with possibly singular dispersion matrix. Stat. Pap. 45, 311–336 (2004)
  41. Koch, K.-R.: Parameter Estimation and Hypothesis Testing in Linear Models, 2nd edn. Springer, Berlin (1999)
  42. Kotsakis, C.: Generalized inner constraints for geodetic network densification problems. J. Geod. 87, 661–673 (2013)
  43. Lavallée, D.A., van Dam, T., Blewitt, G., Clarke, P.J.: Geocenter motions from GPS: a unified observation model. J. Geophys. Res. Solid Earth 111(B5) (2006). https://doi.org/10.1029/2005JB003784
  44. Meindl, M., Beutler, G., Thaller, D., Dach, R., Jäggi, A.: Geocenter coordinates estimated from GNSS data as viewed by perturbation theory. Adv. Space Res. 51(7), 1047 (2013)
  45. Meissl, P.: Die innere Genauigkeit eines Punkthaufens [The inner accuracy of a point cluster]. Österreichische Zeitschrift für Vermessungswesen 50, 159–165 and 186–194 (1962)
  46. Meissl, P.: Über die innere Genauigkeit dreidimensionaler Punkthaufen [On the inner accuracy of three-dimensional point clusters]. Zeitschrift für Vermessungswesen 90(4), 109–118 (1965)
  47. Meissl, P.: Zusammenfassung und Ausbau der inneren Fehlertheorie eines Punkthaufens [Summary and extension of the inner error theory of a point cluster]. Deutsche Geodätische Kommission, Reihe A, Nr. 61, 8–21 (1969)
  48. Moore, E.H.: On the reciprocal of the general algebraic matrix. Bull. Am. Math. Soc. 26(9), 394–395 (1920)
  49. Munk, W.H., MacDonald, G.J.F.: The Rotation of the Earth. Cambridge University Press, London (1960)
  50. Penrose, R.: A generalized inverse for matrices. Proc. Cambridge Philos. Soc. 51, 406–413 (1955)
  51. Pearlman, M.R., Degnan, J.J., Bosworth, J.M.: The international laser ranging service. Adv. Space Res. 30(2), 135–143 (2002)
  52. Petit, G., Luzum, B.: IERS Conventions (2010). IERS Technical Note No. 36, Verlag des Bundesamts für Kartographie und Geodäsie, Frankfurt am Main (2010). A continuously updated working version is available at http://iers-conventions.obspm.fr/updates/2010updatesinfo.php
  53. Rangelova, E., van der Wal, W., Sideris, M.G., Wu, P.: Spatiotemporal analysis of the GRACE-derived mass variations in North America by means of multi-channel singular spectrum analysis. In: Mertikas, S.P. (ed.) Gravity, Geoid and Earth Observation. International Association of Geodesy Symposia, vol. 135, pp. 539–546. Springer, Berlin/Heidelberg (2010)
  54. Rao, C.R.: Unified theory of linear estimation. Sankhyā, Series A, vol. 33, pp. 371–394 (1971). Corrigenda: Sankhyā, Series A, vol. 34, pp. 194, 477 (1972)
  55. Rao, C.R.: Unified theory of least squares. Commun. Stat. Theory Methods 1(1), 1–8 (1973)
  56. Rao, C.R.: Linear Statistical Inference and Its Applications, 2nd edn. Wiley, New York (1973)
  57. Rebischung, P., Altamimi, Z., Springer, T.: A collinearity diagnosis of the GNSS geocenter determination. J. Geod. 88(1), 65–85 (2014). https://doi.org/10.1007/s00190-013-0669-5
  58. Rothacher, M., Angermann, D., Artz, T., Bosch, W., Drewes, H., Gerstl, M., Kelm, R., König, D., König, R., Meisel, B., Müller, H., Nothnagel, A., Panafidina, N., Richter, B., Rudenko, S., Schwegmann, W., Seitz, M., Steigenberger, P., Tesmer, S., Tesmer, V., Thaller, D.: GGOS-D: homogeneous reprocessing and rigorous combination of space geodetic observations. J. Geod. 85, 679–705 (2011)
  59. Schuh, H., Behrend, D.: VLBI: a fascinating technique for geodesy and astrometry. J. Geodyn. 61, 68–80 (2012). https://doi.org/10.1016/j.jog.2012.07.007
  60. Seitz, M., Angermann, D., Blossfeld, M., Drewes, H., Gerstl, M.: The 2008 DGFI realization of the ITRS: DTRF2008. J. Geod. 86, 1097–1123 (2012)
  61. Tisserand, F.: Traité de Mécanique Céleste [Treatise on Celestial Mechanics]. Gauthier-Villars, Paris (1889)
  62. Willis, P., Fagard, H., Ferrage, P., Lemoine, F.G., Noll, C.E., Noomen, R., Otten, M., Ries, J.C., Rothacher, M., Soudarin, L., Tavernier, G., Valette, J.-J.: The international DORIS service: toward maturity. Adv. Space Res. 45(12), 1408–1420 (2010). https://doi.org/10.1016/j.asr.2009.11.018
  63. Zhu, S.-Y., Mueller, I.I.: Effects of adopting new precession, nutation and equinox corrections on the terrestrial reference frame. Bull. Géod. 57, 29–42 (1983)
