Interferometer Techniques for Astrometry and Geodesy

Thompson, A. Richard; Moran, James M.; Swenson, George W.

doi:10.1007/978-3-319-44431-4_12

Chapter
Open Access
First Online: 23 February 2017

Part of the book series: Astronomy and Astrophysics Library (AAL)

Abstract

This chapter is concerned with the techniques by which angular positions of radio sources can be measured with the greatest possible accuracy, and with the design of interferometers for optimum determination of source-position, baseline, and geodetic parameters.

This chapter is concerned with the techniques by which angular positions of radio sources can be measured with the greatest possible accuracy, and with the design of interferometers for optimum determination of source-position, baseline, and geodetic parameters.

The total fringe phase of an interferometer, where the effect of delay tracking is removed, can be expressed in terms of the scalar product of the baseline and source-position vectors D and s, respectively, as

$$\displaystyle{ \phi = \frac{2\pi } {\lambda } \mathbf{D}\boldsymbol{\,\cdot \,}\mathbf{s} = \frac{2\pi } {\lambda } D\,\cos \theta \;, }$$

(12.1)

where θ is the angle between D and s. Up to this point, we have assumed that these factors are describable by constants that can be specified with high accuracy. However, the measurement of source positions to an accuracy better than a milliarcsecond (mas) requires, for example, that variation in the Earth’s rotation vector be taken into account. The baseline accuracy is comparable to that at which variation in the antenna positions resulting from crustal motions of the Earth can be detected. The calibration of the baseline and the measurement of source positions can be accomplished in a single observing period of one or more days. Geodetic data are obtained from repetition of this procedure over intervals of months or years, which reveals the variation in the baseline and Earth-rotation parameters.

The redefinition of the meter from a fundamental to a derived quantity has an interesting implication for the units of baseline length derived from interferometric data. An interferometer measures the relative time of arrival of the signal wavefront at the two antennas, that is, the geometric delay. Baselines determined from interferometric data are therefore in units of light travel time. Conversion to meters formerly depended on the value chosen for c. However, in 1983, the Conférence Générale des Poids et Mesures adopted a new definition of the meter: “the meter is the length of the path traveled by light in vacuum during a time interval of 1/299,792,458 of a second.” The second and the speed of light are now primary quantities, and the meter is a derived quantity. Thus, baseline lengths can be given unambiguously in meters. Issues related to fundamental units are discussed by Petley (1983).

12.1 Requirements for Astrometry

We begin with a heuristic discussion of how baseline and source-position parameters may be determined. A more formal discussion is given in Sect. 12.2.

The phase of the fringe pattern for a tracking interferometer [Eq. (12.1)] can be expressed in polar coordinates (see Fig. 4.2) as

$$\displaystyle{ \phi (H) = 2\pi D_{\lambda }[\sin d\sin \delta +\cos d\cos \delta \cos (H - h)] +\phi _{\mathrm{in}}\;, }$$

(12.2)

where D _λ is the length of the baseline in wavelengths, H and δ are the hour angle and declination of the source, h and d are the hour angle and declination of the baseline, and ϕ _in is an instrumental phase term. We assume for the purpose of this discussion that ϕ _in is a fixed constant, unaffected by the atmosphere and electronic drifts. The hour angle is related to the right ascension α by

$$\displaystyle{ H = t_{s}-\alpha \;, }$$

(12.3)

where t _s is the sidereal time (in VLBI, t _s and H are referred to the Greenwich meridian, whereas in connected interferometry, they are often referred to the local meridian). Consider a short-baseline interferometer that is laid out in exactly the east–west direction, i.e., with d = 0, $h = \frac{\pi } {2}$ [see Ryle and Elsmore (1973)]. Then

$$\displaystyle{ \phi (H) = -2\pi D_{\lambda }\cos \delta \sin H +\phi _{\mathrm{in}}\;, }$$

(12.4)

and the phase goes through one sinusoidal oscillation in a sidereal day. Suppose that the source is circumpolar, i.e., above the horizon for 24 h. From continuous observation of ϕ over a full day, the 2π crossings of ϕ can be tracked so that there are no phase ambiguities. The average value of the geometric term of Eq. (12.4) is zero, so that ϕ _in can be estimated and removed. When the source transits the local meridian, H = 0, the corrected phase will be zero, and the right ascension α can be determined from the time of this transit using Eq. (12.3).

The length of the baseline can be determined by observing sources close to the celestial equator, | δ | ∼ 0, where the dependence of phase on δ is small. With this calibration of D _λ and ϕ _in, the positions of other sources can be determined, i.e., their right ascensions from the transit times of the central fringe and their declinations from the diurnal amplitude of ϕ(H). The source declination can also be found from the rate of change of phase at H = 0. This rate of change of phase is

$$\displaystyle{ \frac{d\phi } {dt}\Bigg\vert _{H=0} = 2\pi D_{\lambda }\omega _{e}\cos \delta \;, }$$

(12.5)

where ω _e = dH∕dt, the rotation rate of the Earth. From Eq. (12.5), it is clear that if the error in dϕ∕dt at H = 0 is σ _f, then the error in declination will be

$$\displaystyle{ \sigma _{\delta } \simeq \frac{1} {2\pi D_{\lambda }\omega _{e}\sin \delta }\,\sigma _{f}\;. }$$

(12.6)

Note that accuracy of the derived declinations is poor for sources near the celestial equator. An informative review of the application of these techniques is given by Smith (1952).
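The east–west procedure described above can be sketched numerically. The following Python fragment is only an illustration under stated assumptions: the baseline length, instrumental phase, and declination are hypothetical values, the phase is taken to be noise-free and already unwrapped, and the Earth rotation rate of 7.292115 × 10⁻⁵ rad s⁻¹ is a standard value that is not quoted in the text. It estimates ϕ _in from the daily mean of Eq. (12.4), recovers the declination from the diurnal amplitude, and evaluates Eq. (12.6).

```python
import math

# Hypothetical values chosen only for illustration.
D_LAMBDA = 1.0e4               # baseline length in wavelengths
PHI_IN = 0.7                   # constant instrumental phase, rad
DELTA = math.radians(60.0)     # declination of a circumpolar source
OMEGA_E = 7.292115e-5          # Earth rotation rate, rad/s (assumed standard value)


def fringe_phase(H, delta, d_lambda=D_LAMBDA, phi_in=PHI_IN):
    """Unwrapped phase of an east-west interferometer, Eq. (12.4)."""
    return -2.0 * math.pi * d_lambda * math.cos(delta) * math.sin(H) + phi_in


# Sample the unwrapped phase uniformly over one sidereal day.
n = 1440
hour_angles = [2.0 * math.pi * k / n for k in range(n)]
phases = [fringe_phase(H, DELTA) for H in hour_angles]

# The geometric term averages to zero over a full day, so the mean estimates phi_in.
phi_in_est = sum(phases) / n

# Half the peak-to-peak excursion is the diurnal amplitude 2*pi*D_lambda*cos(delta).
amplitude = 0.5 * (max(phases) - min(phases))
delta_est = math.acos(amplitude / (2.0 * math.pi * D_LAMBDA))


def declination_error(sigma_f, delta, d_lambda, omega_e=OMEGA_E):
    """Eq. (12.6): declination error (rad) for a phase-rate error sigma_f (rad/s)."""
    return sigma_f / (2.0 * math.pi * d_lambda * omega_e * math.sin(delta))
```

With these inputs the estimates reproduce the assumed ϕ _in and δ, and `declination_error` grows as δ approaches zero, which is the behavior noted above for sources near the celestial equator.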

In the determination of right ascension, interferometer observations provide relative measurements, that is, the differences in right ascension among different sources. The zero of right ascension is defined as the great circle through the pole and through the intersection of the celestial equator and the ecliptic at the vernal equinox at a specific epoch. The vernal equinox is the point at which the apparent position of the Sun moves from the Southern to the Northern celestial hemisphere. This direction can be located in terms of the motions of the planets, which are well-defined objects for optical observations. It has been related to the positions of bright stars that provide a reference system for optical measurements of celestial position. Relating the radio measurements to the zero of right ascension is less easy, since solar system objects are generally weak or do not contain sharp enough features in their radio structure. In the 1970s, results were obtained from the lunar occultation of the source 3C273B (Hazard et al. 1971) and from measurements of the weak radio emission from nearby stars such as Algol (β Persei) (Ryle and Elsmore 1973; Elsmore and Ryle 1976).

In the reduction of interferometer measurements in astrometry, the visibility data are interpreted basically in terms of the positions of point sources. The data processing is equivalent, in effect, to model fitting using delta-function intensity components, the visibility function for which has been discussed in Sect. 4.4. The essential position data are determined from the calibrated visibility phase or, in some VLBI observations, from the geometric delay as measured by maximization of the cross-correlation of the signals (i.e., the use of the bandwidth pattern) and from the fringe frequency. Because the position information is contained in the visibility phase, measurements of closure phase discussed in Sect. 10.3 are of use in astrometry and geodesy only insofar as they can provide a means of correcting for the effects of source structure. Uniformity of (u, v) coverage is less important than in imaging because high dynamic range is generally not needed. Determination of the position of an unresolved source depends on interferometry with precise phase calibration and a sufficient number of baselines to avoid ambiguity in the position.

12.1.1 Reference Frames

A reference frame based on the positions of distant extragalactic objects can be expected to show greater temporal stability than a frame based on stellar positions and to approach more closely the conditions of an inertial frame. An inertial frame is one that is at rest or in uniform motion with respect to absolute space and not in a state of acceleration or rotation [see, e.g., Mueller (1981)]. Newton’s first law holds in such a frame. A detailed description of astronomical reference frames is given by Johnston and de Vegt (1999). The International Celestial Reference System (ICRS) adopted by the International Astronomical Union (IAU) specifies the zero points and directions of the axes of the coordinate system for celestial positions. The measured positions of a set of reference objects in the coordinates of the reference system provide the International Celestial Reference Frame (ICRF). Thus, the frame provides the reference points with respect to which positions of other objects are measured within the coordinate system.

The most accurate measurements of celestial positions are those of selected extragalactic sources observed by VLBI. Large databases of such high-resolution observations exist as a result of measurements made for purposes of geodesy and astrometry. These measurements have been made in a systematic way mainly since 1979, using VLBI systems with dual frequencies of 2.3 and 8.4 GHz to allow calibration of ionospheric effects. The positions are determined mainly by the 8.4 GHz data. The first catalog of source positions, now called ICRF1 (Ma et al. 1998), was adopted by the IAU in 1998. This frame supersedes earlier ones based on optical positions of stars, most recently those of the FK5 and Hipparcos catalogs. The ICRF1 was based on 1.6 × 10⁶ measurements of group delay of 608 sources, obtained between 1979 and 1995. Criteria for exclusion of a source included inconsistency in the position measurements, evidence of motion, or presence of extended structure. In this study, 212 sources were found that passed all tests; 294 failed in one criterion; and 102 other sources, including 3C273, failed in several. The 212 sources in the best category were used to define the reference frame. Only 27% of these are in the Southern Hemisphere. A global solution provides the positions of the sources together with the antenna positions and various geodetic and atmospheric parameters. Position errors of the 212 defining sources are mostly less than 0.5 mas in both right ascension and declination and less than 1 mas in almost all cases.

In 2009, an updated reference frame catalog called ICRF2 was released (Fey et al. 2009, 2015) and was adopted by the IAU. It contains the positions of 3,414 sources derived from 6.5 × 10⁶ measurements of group delay acquired over 30 years through 2009. About 28% of the data came from the VLBA. The core reference frame is based on, and maintained with, data from 295 sources, which are distributed much more uniformly over the celestial sphere than the defining sources of ICRF1. The positional accuracy is about 40 μas, about five times better than that achieved in ICRF1.

About 50% of sources in the ICRF have redshifts greater than 1.0. The use of such distant objects to define the reference frame provides a level of astrometric uncertainty at least an order of magnitude better than optical measurements of stars. The ultimate accuracy of this frame may depend on the structural stability of the radio sources involved [see, e.g., Fey and Charlot (1997), Fomalont et al. (2011)]. The level of uncertainty in the connection between the radio and optical frames is essentially the uncertainty in optical positions. Radio measurements of the positions of some of the nearer stars provide a comparison between the radio and optical frames. Lestrade (1991) and Lestrade et al. (1990, 1995) have measured the positions of about ten stars by VLBI, achieving accuracy in the range 0.5–1.5 mas. These results provide a link between the ICRF and the star positions in the Hipparcos catalog. The visual magnitudes of the known optical counterparts of the reference frame sources are mostly within the range 15–21, and precise positions of objects fainter than 18th magnitude are likely to be very difficult to obtain.

There are several methods of linking the extragalactic reference frame to the heliocentric reference frame. Pulsar positions can be derived from both timing measurements and VLBI measurements (Bartel et al. 1985; Fomalont et al. 1992; Madison et al. 2013). The timing analysis is inherently linked to the heliocentric frame. VLBI of space probes in orbit around solar system bodies can also help link the frames [see Jones et al. (2015)]. Radio observations of minor planets may also be helpful (Johnston et al. 1982).

12.2 Solution for Baseline and Source-Position Vectors

We now discuss in a more formal way how interferometer baselines and source positions can be estimated simultaneously for phase, fringe rate, or group delay measurements. For discussion of early implementations of these techniques, see Elsmore and Mackay (1969), Wade (1970), and Brosche et al. (1973). An excellent tutorial is Fomalont (1995).

12.2.1 Phase Measurements

Consider an observation with a two-element tracking interferometer of arbitrary baseline in which the source is unresolved. Let D _λ be the assumed baseline vector, in units of the wavelength, and (D _λ −Δ D _λ) be the true vector. Similarly, let s be a unit vector indicating the assumed position of the source, and let (s −Δ s) indicate the true position. Note that the convention used is Δ term = (approximate or assumed value) – (true value). The expected fringe phase, using the assumed positions, is $2\pi \mathbf{D}_{\lambda }\,\boldsymbol{\cdot }\,\mathbf{s}$. The observed phase, measured relative to the expected phase, is a function of the hour angle H of the source given by

$$\displaystyle\begin{array}{rcl} \varDelta \phi (H)& =& \,2\pi \,[\mathbf{D}_{\lambda }\,\boldsymbol{\cdot }\,\mathbf{s} - (\mathbf{D}_{\lambda } -\varDelta \mathbf{D}_{\lambda })\,\boldsymbol{\cdot }\,(\mathbf{s} -\varDelta \mathbf{s})] +\phi _{\mathrm{in}} \\ & =& \,2\pi \,(\varDelta \mathbf{D}_{\lambda }\,\boldsymbol{\cdot }\,\mathbf{s} + \mathbf{D}_{\lambda }\,\boldsymbol{\cdot }\,\varDelta \mathbf{s}) +\phi _{\mathrm{in}}\;. {}\end{array}$$

(12.7)

A second-order term involving $\varDelta \mathbf{D}_{\lambda }\,\boldsymbol{\cdot }\,\varDelta \mathbf{s}$ has been neglected since we assume that fractional errors in D _λ and s are small.

The baseline vector can be written in terms of the coordinate system introduced in Sect. 4.1:

$$\displaystyle{ \mathbf{D}_{\lambda } = \left [\begin{array}{*{10}c} X_{\lambda } \\ Y _{\lambda } \\ Z_{\lambda }\\ \end{array} \right ]\;,\qquad \varDelta \mathbf{D}_{\lambda } = \left [\begin{array}{*{10}c} \varDelta X_{\lambda } \\ \varDelta Y _{\lambda } \\ \varDelta Z_{\lambda }\end{array} \right ]\;, }$$

(12.8)

where X _λ, Y _λ, and Z _λ form a right-handed coordinate system with Z _λ parallel to the Earth’s spin axis and X _λ in the meridian plane of the interferometer. The source-position vector can be specified in the (X _λ, Y _λ, Z _λ) system in terms of the hour angle H and declination δ of the source by using Eq. (4.2):

$$\displaystyle{ \mathbf{s} = \left [\begin{array}{*{10}c} s_{X} \\ s_{Y } \\ s_{Z}\\ \end{array} \right ] = \left [\begin{array}{*{10}c} \cos \delta \cos H\\ -\cos \delta \sin H\\ \sin \delta \\ \end{array} \right ]\;. }$$

(12.9)

Taking the differential of Eq. (12.9), we can write

$$\displaystyle{ \varDelta \mathbf{s} \simeq \left [\begin{array}{*{10}c} -\sin \delta \cos H\varDelta \delta +\cos \delta \sin H\varDelta \alpha \\ \sin \delta \sin H\varDelta \delta +\cos \delta \cos H\varDelta \alpha \\ \cos \delta \varDelta \delta \\ \end{array} \right ]\;, }$$

(12.10)

where Δ α and Δ δ are the angular errors in right ascension and declination. Note that Δ α = −Δ H [see Eq. (12.3)].
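Equation (12.10) can be checked numerically against a direct difference of the unit vectors of Eq. (12.9). The sketch below is only an illustration: the hour angle, declination, and error values are hypothetical, and the function names are not from the text. It uses the stated convention that a Δ term is the assumed value minus the true value, together with Δ H = −Δ α.

```python
import math


def source_vector(H, delta):
    """Unit vector s of Eq. (12.9) for hour angle H and declination delta (radians)."""
    return (math.cos(delta) * math.cos(H),
            -math.cos(delta) * math.sin(H),
            math.sin(delta))


def delta_s(H, delta, d_alpha, d_delta):
    """First-order position-error vector of Eq. (12.10)."""
    return (-math.sin(delta) * math.cos(H) * d_delta
            + math.cos(delta) * math.sin(H) * d_alpha,
            math.sin(delta) * math.sin(H) * d_delta
            + math.cos(delta) * math.cos(H) * d_alpha,
            math.cos(delta) * d_delta)


def exact_difference(H, delta, d_alpha, d_delta):
    """s(assumed) - s(true), where true = assumed - error.

    Since H = t_s - alpha, the true hour angle is H + d_alpha.
    """
    assumed = source_vector(H, delta)
    true = source_vector(H + d_alpha, delta - d_delta)
    return tuple(a - t for a, t in zip(assumed, true))


# Hypothetical test values: position errors of roughly 0.2 arcsec.
H, delta = math.radians(35.0), math.radians(50.0)
d_alpha, d_delta = 1.0e-6, -8.0e-7

linear = delta_s(H, delta, d_alpha, d_delta)
exact = exact_difference(H, delta, d_alpha, d_delta)
max_mismatch = max(abs(l - e) for l, e in zip(linear, exact))
```

The mismatch between the two vectors is of second order in the errors, which is the size of the terms neglected in the linearization.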

Consider the case in which there exists a catalog of sources whose positions are considered to be known perfectly. Most connected-element arrays, e.g., ALMA, the VLA, the SMA, and IRAM, have far more antenna pads than antennas, so that the arrays can be reconfigured for various resolutions. Each time an array is reconfigured, the baselines must be redetermined because of the mechanical imprecision of positioning the antennas on the pads. With only baseline errors, i.e., Δ s = 0, the residual phase [substitute Eqs. (12.8) and (12.9) into (12.7)] is

$$\displaystyle{ \varDelta \phi (H) =\phi _{0} +\phi _{1}\cos H +\phi _{2}\sin H\;, }$$

(12.11)

where

$$\displaystyle\begin{array}{rcl} \phi _{0}& =\,& 2\pi \sin \delta \varDelta Z_{\lambda } +\phi _{\mathrm{in}}\;, \\ \phi _{1}& =\,& 2\pi \cos \delta \varDelta X_{\lambda }\;, \\ \phi _{2}& =\,& -2\pi \cos \delta \varDelta Y _{\lambda }\;.{}\end{array}$$

(12.12)

A long track on a single source can be fitted to a sinusoidal function in H with three free parameters, ϕ ₀, ϕ ₁, and ϕ ₂. Δ X _λ and Δ Y _λ can be found from ϕ ₁ and ϕ ₂, respectively. To separate the instrumental term from Δ Z _λ, it is necessary to observe several sources. A simple graphical analysis would be to plot ϕ ₀ vs. sinδ for these sources; Δ Z _λ is given by the slope, i.e., d ϕ ₀∕d(sinδ), and ϕ _in by the sinδ = 0 intercept.
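The baseline calibration just described can be sketched as follows. This is a minimal illustration rather than a production reduction: it assumes noise-free, unwrapped residual phases, uses hypothetical baseline errors and source declinations, averages the per-source estimates of Δ X _λ and Δ Y _λ, and replaces the graphical analysis with a straight-line least-squares fit of ϕ ₀ against sinδ. All function names are invented for the example.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_track(hour_angles, phases):
    """Least-squares fit of phi0 + phi1*cos(H) + phi2*sin(H), Eq. (12.11)."""
    basis = [(1.0, math.cos(H), math.sin(H)) for H in hour_angles]
    ata = [[sum(r[i] * r[j] for r in basis) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * p for r, p in zip(basis, phases)) for i in range(3)]
    return solve_linear(ata, atb)


def baseline_errors(tracks):
    """tracks: list of (declination, hour_angles, residual_phases), all in radians.

    Returns (dX, dY, dZ, phi_in), baseline errors in wavelengths, using Eq. (12.12).
    """
    sin_decs, phi0s, dxs, dys = [], [], [], []
    for delta, hs, ps in tracks:
        phi0, phi1, phi2 = fit_track(hs, ps)
        dxs.append(phi1 / (2.0 * math.pi * math.cos(delta)))
        dys.append(-phi2 / (2.0 * math.pi * math.cos(delta)))
        sin_decs.append(math.sin(delta))
        phi0s.append(phi0)
    # Line fit of phi0 against sin(delta): slope = 2*pi*dZ, intercept = phi_in.
    n = len(sin_decs)
    xm, ym = sum(sin_decs) / n, sum(phi0s) / n
    slope = (sum((x - xm) * (y - ym) for x, y in zip(sin_decs, phi0s))
             / sum((x - xm) ** 2 for x in sin_decs))
    return sum(dxs) / n, sum(dys) / n, slope / (2.0 * math.pi), ym - slope * xm


def simulate_track(delta, hour_angles, dx, dy, dz, phi_in):
    """Noise-free residual phase 2*pi*(Delta D . s) + phi_in from Eqs. (12.7)-(12.9)."""
    c, s = math.cos(delta), math.sin(delta)
    return [2.0 * math.pi * (dx * c * math.cos(H) - dy * c * math.sin(H) + dz * s) + phi_in
            for H in hour_angles]


# Hypothetical baseline errors (wavelengths) and instrumental phase (rad).
TRUE = (0.02, -0.01, 0.03, 0.4)
hour_angles = [math.radians(15.0 * h) for h in range(-5, 6)]
tracks = [(math.radians(dec), hour_angles,
           simulate_track(math.radians(dec), hour_angles, *TRUE))
          for dec in (10.0, 40.0, 70.0)]
dx, dy, dz, phi_in = baseline_errors(tracks)
```

A single source yields only one value of ϕ ₀, so the line fit needs at least two declinations; a wide spread in sinδ keeps the slope and intercept well separated.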

In the general case, as encountered in geodetic applications of VLBI, it is necessary to determine both baseline and source positions. Here, the residual phase [substitute Eqs. (12.8)–(12.10) into (12.7)] is the same as Eq. (12.11) but with

$$\displaystyle\begin{array}{rcl} \phi _{0}& =\,& 2\pi \left (\sin \delta \varDelta Z_{\lambda } + Z_{\lambda }\cos \delta \varDelta \delta \right ) +\phi _{\mathrm{in}}\;, \\ \phi _{1}& =\,& 2\pi \left (\cos \delta \varDelta X_{\lambda } + Y _{\lambda }\cos \delta \varDelta \alpha - X_{\lambda }\sin \delta \varDelta \delta \right )\;, \\ \phi _{2}& =\,& 2\pi \left (-\cos \delta \varDelta Y _{\lambda } + X_{\lambda }\cos \delta \varDelta \alpha + Y _{\lambda }\sin \delta \varDelta \delta \right )\;.{}\end{array}$$

(12.13)

Interleaved observations need to be made of a set of sources over a period of ∼ 12 h. Three parameters (ϕ ₀, ϕ ₁, and ϕ ₂) can be derived for each source. If n _s sources are observed, 3n _s quantities are obtained. The number of unknown parameters required to specify the n _s positions, the baseline, and the instrumental phase (assumed to be constant) is 2n _s + 3; the right ascension of one source is chosen arbitrarily. Thus, if n _s ≥ 3, it is possible to solve for all the unknown quantities. Note that the sources should have as wide a range in declination as possible in order to distinguish Δ Z from ϕ _in in Eq. (12.12). Least-mean-squares analysis provides simultaneous solutions for the instrumental parameters and the source positions. Usually, many more than three sources are observed, so there is redundant information, and variation of the instrumental phase with time as well as other parameters can be included in the solution. A discussion of the method of least-mean-squares analysis can be found in Appendix 12.1.
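The parameter count in the preceding paragraph can be tabulated with a few lines of Python. The function name is hypothetical; the arithmetic is the count given in the text.

```python
def phase_solution_counts(n_sources):
    """Return (measured, unknowns) for fringe-phase data on n_sources sources."""
    measured = 3 * n_sources                 # phi0, phi1, phi2 for each source
    # Two coordinates per source with one right ascension fixed arbitrarily,
    # plus three baseline components and one constant instrumental phase.
    unknowns = (2 * n_sources - 1) + 3 + 1
    return measured, unknowns


for n in range(1, 6):
    measured, unknowns = phase_solution_counts(n)
    print(n, measured, unknowns, measured >= unknowns)
```

The condition `measured >= unknowns` is first met at three sources, in agreement with n _s ≥ 3.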

Most astronomers are concerned with measuring the position of a source of interest with respect to a nearby calibrator taken from the ICRF or other catalog on an interferometer with well-calibrated baselines. In this case, the phase terms for Eq. (12.11) are

$$\displaystyle\begin{array}{rcl} \phi _{0}& =\,& 2\pi Z_{\lambda }\cos \delta \varDelta \delta +\phi _{\mathrm{in}}\;, \\ \phi _{1}& =\,& 2\pi \left (Y _{\lambda }\cos \delta \varDelta \alpha - X_{\lambda }\sin \delta \varDelta \delta \right )\;, \\ \phi _{2}& =\,& 2\pi \left (X_{\lambda }\cos \delta \varDelta \alpha + Y _{\lambda }\sin \delta \varDelta \delta \right )\;.{}\end{array}$$

(12.14)

However, the fringe visibility of a point source at position l = Δ αcosδ and m = Δ δ is

$$\displaystyle{ V = V _{0}e^{-j2\pi \,(ul+vm)} = V _{ 0}e^{-j\varDelta \phi (H)}\;. }$$

(12.15)

Thus, the source can be imaged by the usual interferometric techniques and its position determined by fitting a Gaussian (or similar) profile to the image plane. The accuracy of position determined in this manner will be limited by the thermal noise to approximately

$$\displaystyle{ \sigma _{\theta } \simeq \frac{1} {2} \frac{\theta _{\mathrm{res}}} {\mathcal{R}_{\mathrm{sn}}}\;, }$$

(12.16)

where θ _res is the resolution of the interferometer, and $\mathcal{R}_{\mathrm{sn}}$ is the signal-to-noise ratio (SNR) [Reid et al. 1988, Condon 1997; see also Eq. (10.68)]. It is shown in Appendix A12.1.3 that the Fourier transform used in imaging is equivalent to a grid parameter search with trial values of α and δ. However, to find baseline parameters or to analyze complex data sets, it is necessary to perform the data analysis in the (u, v) plane.
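As a rough numerical illustration of Eq. (12.16), the sketch below converts a resolution and SNR into a position uncertainty. It assumes that θ _res can be approximated by the wavelength divided by the baseline length, which is a simplification not stated in the text, and the input values are hypothetical.

```python
import math

MAS_PER_RAD = math.degrees(1.0) * 3600.0 * 1000.0   # milliarcseconds per radian


def position_uncertainty_mas(wavelength_m, baseline_m, snr):
    """Eq. (12.16) with theta_res approximated as wavelength / baseline (assumption)."""
    theta_res = wavelength_m / baseline_m            # radians
    return 0.5 * theta_res / snr * MAS_PER_RAD


# Hypothetical case: 3.6-cm wavelength, 8000-km baseline, SNR of 20.
sigma_mas = position_uncertainty_mas(0.036, 8.0e6, 20.0)
```

For this example the resolution is a little under 1 mas and the thermal-noise position uncertainty is a few hundredths of a milliarcsecond; systematic errors are not included.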

12.2.2 Measurements with VLBI Systems

The use of independent local oscillators at the antennas in VLBI systems does not easily permit calibration of absolute fringe phase. The earliest method used for obtaining positional information in VLBI was the analysis of the fringe frequency (fringe rate). The fringe frequency is the time rate of change of the interferometer phase. Thus, from Eq. (12.2), the fringe frequency is

$$\displaystyle{ \nu _{f} = \frac{1} {2\pi } \frac{d\phi } {dt} = -\omega _{e}D_{\lambda }\!\cos d\cos \delta \sin (H - h) +\nu _{\mathrm{in}}\;, }$$

(12.17)

where ω _e is the angular velocity of rotation of the Earth (dH∕dt), and ν _in is an instrumental term equal to d ϕ _in∕dt. The component ν _in largely results from residual differences in the frequencies of the hydrogen masers, which provide the local oscillator references at the antennas.

The quantity D _λcosd is the projection of the baseline in the equatorial plane, denoted D _E. Thus, Eq. (12.17) can be rewritten

$$\displaystyle{ \nu _{f} = -\omega _{e}D_{E}\,\cos \delta \sin (H - h) +\nu _{\mathrm{in}}\;. }$$

(12.18)

The polar component of the baseline (the projection of the baseline along the polar axis) does not appear in the equation for fringe frequency. An interferometer with a baseline parallel to the spin axis of the Earth has lines of constant phase parallel to the celestial equator, and the interferometer phase does not change with hour angle. Therefore, the polar component of the baseline cannot be determined from the analysis of fringe frequency.

The usual practice in VLBI is to refer hour angles to the Greenwich meridian. We follow this convention and use a right-handed coordinate system with X through the Greenwich meridian and with Z toward the north celestial pole. Thus, in terms of the Cartesian coordinates for the baseline, Eq. (12.18) becomes

$$\displaystyle{ \nu _{f} = -\omega _{e}\cos \delta \,(X_{\lambda }\sin H + Y _{\lambda }\cos H) +\nu _{\mathrm{in}}\;. }$$

(12.19)

The residual fringe frequency Δ ν _f, that is, the difference between the observed and expected fringe frequencies, can be calculated by taking the differentials of Eq. (12.19) with respect to δ, H, X _λ, and Y _λ and also including the unknown quantity ν _in. We thereby obtain

$$\displaystyle{ \varDelta \nu _{f} = a_{1}\cos H + a_{2}\sin H +\nu _{\mathrm{in}}\;, }$$

(12.20)

where

$$\displaystyle{ a_{1} =\omega _{e}(Y _{\lambda }\sin \delta \varDelta \delta + X_{\lambda }\cos \delta \varDelta \alpha -\cos \delta \varDelta Y _{\lambda }) }$$

(12.21)

and

$$\displaystyle{ a_{2} =\omega _{e}(X_{\lambda }\sin \delta \varDelta \delta - Y _{\lambda }\!\cos \delta \varDelta \alpha -\cos \delta \varDelta X_{\lambda })\;. }$$

(12.22)

Note that Δ ν _f is a diurnal sinusoid and that the average value of Δ ν _f is the instrumental term ν _in. Information about source positions and baselines must come from the two parameters a ₁ and a ₂. Therefore, unlike the case of fringe phase [Eq. (12.11)] where three parameters per source are available, it is not possible to solve for both source and baseline parameters with fringe-frequency data. For example, from observations of n _s sources, 2n _s + 1 quantities are obtained. The total number of unknowns (two baseline parameters, 2n _s source parameters, and ν _in) is 2n _s + 3. If the position of one source is known, the rest of the source positions and X _λ, Y _λ, and ν _in can be determined. Note that the accuracy of the measurements of source declinations is reduced for sources close to the celestial equator because of the sinδ factor in Eqs. (12.21) and (12.22).
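The linearization in Eqs. (12.20)–(12.22) can be checked against a direct difference of Eq. (12.19), and the parameter count can be written out in the same way as for fringe phase. The sketch below is an illustration only: ν _in is set to zero, the baseline components and errors are hypothetical, the Earth rotation rate is an assumed standard value, and the function names are invented for the example.

```python
import math

OMEGA_E = 7.292115e-5   # Earth rotation rate, rad/s (assumed standard value)


def fringe_frequency(x, y, delta, H):
    """Geometric part of Eq. (12.19); x and y are in wavelengths, result in Hz."""
    return -OMEGA_E * math.cos(delta) * (x * math.sin(H) + y * math.cos(H))


def residual_linear(x, y, delta, H, dx, dy, d_delta, d_alpha):
    """a1*cos(H) + a2*sin(H) with a1 and a2 from Eqs. (12.21) and (12.22)."""
    a1 = OMEGA_E * (y * math.sin(delta) * d_delta
                    + x * math.cos(delta) * d_alpha - math.cos(delta) * dy)
    a2 = OMEGA_E * (x * math.sin(delta) * d_delta
                    - y * math.cos(delta) * d_alpha - math.cos(delta) * dx)
    return a1 * math.cos(H) + a2 * math.sin(H)


def residual_exact(x, y, delta, H, dx, dy, d_delta, d_alpha):
    """Assumed minus true fringe frequency; the true hour angle is H + d_alpha."""
    return (fringe_frequency(x, y, delta, H)
            - fringe_frequency(x - dx, y - dy, delta - d_delta, H + d_alpha))


def fringe_frequency_counts(n_sources):
    """Return (measured, unknowns): a1, a2 per source plus one mean value,
    against two baseline components, 2*n_sources coordinates, and nu_in."""
    return 2 * n_sources + 1, 2 * n_sources + 3


# Hypothetical baseline (wavelengths), source direction, and small errors.
args = (2.0e7, -1.0e7, 0.6, 1.0, 0.5, -0.3, 1.0e-8, 2.0e-8)
linear = residual_linear(*args)
exact = residual_exact(*args)
```

The unknowns exceed the measured quantities by two for any number of sources, which is why the position of one source must be supplied from elsewhere.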

As an illustration of the order of magnitude of the parameters involved in fringe-frequency observations, consider two antennas with an equatorial component of spacing equal to 1000 km and an observing wavelength of 3 cm. Then D _E ≃ 3 × 10⁷ wavelengths, and the fringe frequency for a low-declination source is about 2 kHz. Assume that the coherence time of the independent frequency standards is about 10 min. In this period, 10⁶ fringe cycles can be counted. If we suppose that the phase can be measured to 0.1 turn, ν _f will be obtained to a precision of 1 part in 10⁷. The corresponding errors in D _E and angular position are 10 cm and 0.02″, respectively.
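The arithmetic of this illustration can be reproduced directly. The sketch below uses the values quoted in the paragraph and an assumed Earth rotation rate of 7.292115 × 10⁻⁵ rad s⁻¹; it takes the maximum fringe frequency ω _e D _E as representative of a low-declination source. The unrounded results agree with the rounded figures in the text to within the expected order-of-magnitude precision.

```python
import math

OMEGA_E = 7.292115e-5          # Earth rotation rate, rad/s (assumed standard value)

d_e_m = 1.0e6                  # equatorial component of the spacing, m
wavelength_m = 0.03            # observing wavelength, m
coherence_s = 600.0            # coherence time of the frequency standards, s
phase_precision_turns = 0.1    # precision of the phase measurement, turns

d_e_wavelengths = d_e_m / wavelength_m             # about 3e7
fringe_frequency_hz = OMEGA_E * d_e_wavelengths    # about 2 kHz
cycles_counted = fringe_frequency_hz * coherence_s # about 1e6
fractional_precision = phase_precision_turns / cycles_counted   # about 1e-7

baseline_error_m = fractional_precision * d_e_m    # about 0.1 m
angle_error_arcsec = math.degrees(fractional_precision) * 3600.0  # about 0.02 arcsec
```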

To overcome the limitations of fringe-frequency analysis, techniques for the precise measurement of the relative group delay of the signals at the antennas were developed. The use of bandwidth synthesis to improve the accuracy of delay measurements has been discussed in Sect. 9.8. The group delay is equal to the geometric delay τ _g except that, as measured, it also includes unwanted components resulting from clock offsets at the antennas and atmospheric differences in the signal paths. The fringe phase measured with a connected-element interferometer observing at frequency ν is 2π ν τ _g, modulo 2π. Except for the dispersive ionosphere, the group delay therefore contains the same type of information as the fringe phase, without the ambiguity resulting from the modulo 2π restriction. Thus, group delay measurements permit a solution for baselines and source positions similar to that discussed above for connected-element systems, except that clock offset terms also must be included.

It is interesting to compare the relative accuracies of group delay and fringe-frequency (or, equivalently, the rate of change of phase delay) measurements. The intrinsic precision with which each of these quantities can be measured is derived in Appendix 12.1 [Eqs. (A12.27) and (A12.34)] and can be written

$$\displaystyle{ \sigma _{f} = \sqrt{ \frac{3} {2\pi ^{2}}}\,\left (\frac{T_{S}} {T_{\!A}}\right ) \frac{1} {\sqrt{\varDelta \nu \tau ^{3}}} }$$

(12.23)

and

$$\displaystyle{ \sigma _{\tau } = \frac{1} {\sqrt{8\pi ^{2}}}\,\left (\frac{T_{S}} {T_{\!A}}\right ) \frac{1} {\sqrt{\varDelta \nu \tau }\varDelta \nu _{\mathrm{rms}}}\;, }$$

(12.24)

where σ _f and σ _τ are the rms errors in fringe frequency and delay, T _S and T _A are the system and antenna temperatures, Δ ν is the IF bandwidth, τ is the integration time, and Δ ν _rms is the rms bandwidth introduced in Sect. 9.8 [see also Eq. (A12.32) and related text in Appendix 12.1]. Δ ν _rms is typically 40% of the spanned bandwidth. For a single rectangular RF band, $\varDelta \nu _{\mathrm{rms}} =\varDelta \nu /\sqrt{12}$. To express the measurement error as an angle, recall that the geometric delay is

$$\displaystyle{ \tau _{g} = \frac{D} {c} \cos \theta \;, }$$

(12.25)

where θ is the angle between the source vector and the baseline vector. Thus, the sensitivity of the delay to angular changes is

$$\displaystyle{ \frac{\varDelta \tau _{g}} {\varDelta \theta _{\tau }} = \frac{D} {c} \sin \theta \;, }$$

(12.26)

where Δ θ _τ is the increment in θ corresponding to an increment Δ τ _g in τ _g. Similarly, the sensitivity of the fringe frequency to angular changes [since ν _f = ν(d τ _g∕dt)] is (for an east–west baseline)

$$\displaystyle{ \frac{\varDelta \nu _{f}} {\varDelta \theta _{f}} = D_{\lambda }\omega _{e}\!\sin \theta \;, }$$

(12.27)

where Δ θ _f is the increment in θ corresponding to an increment Δ ν _f in ν _f. Thus, by setting Δ ν _f = σ _f and Δ τ _g = σ _τ and ignoring geometric factors, we obtain the equation

$$\displaystyle{ \frac{\varDelta \theta _{\tau }} {\varDelta \theta _{f}} \simeq 2\pi \frac{\tau /t_{e}} {\varDelta \nu /\nu } \;, }$$

(12.28)

where t _e = 2π∕ω _e is the period of the Earth’s rotation. Equation (12.28) describes the relative precision of delay and fringe-frequency measurements. In practice, measurements of delay are generally more accurate because of the noise imposed by the atmosphere. Measurements of fringe frequency are sensitive to the time derivative of atmospheric path length, and in a turbulent atmosphere, this derivative can be large, while the average path length is relatively constant. Note that fringe-frequency and delay measurements are complementary. For example, with a VLBI system of known baseline and instrumental parameters, the position of a source can be found from a single observation using the delay and fringe frequency because these quantities constrain the source position in approximately orthogonal directions. The earliest analyses of fringe-frequency and delay measurements to determine source positions and baselines were made by Cohen and Shaffer (1971) and Hinteregger et al. (1972).
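The step from Eqs. (12.23)–(12.27) to Eq. (12.28) can be verified numerically. The sketch below is an illustration under stated assumptions: a single rectangular band, so that Δ ν _rms = Δ ν∕√12; geometric factors ignored, as in the text; a sidereal rotation period of 86,164.1 s, which is a standard value not quoted in the text; and hypothetical observing parameters.

```python
import math

T_E = 86164.1                    # period of the Earth's rotation, s (assumed value)
OMEGA_E = 2.0 * math.pi / T_E    # rad/s


def sigma_fringe_frequency(ts_over_ta, bandwidth_hz, tau_s):
    """Eq. (12.23): rms error in fringe frequency, Hz."""
    return (math.sqrt(3.0 / (2.0 * math.pi ** 2)) * ts_over_ta
            / math.sqrt(bandwidth_hz * tau_s ** 3))


def sigma_delay(ts_over_ta, bandwidth_hz, tau_s, rms_bandwidth_hz):
    """Eq. (12.24): rms error in group delay, s."""
    return ts_over_ta / (math.sqrt(8.0 * math.pi ** 2)
                         * math.sqrt(bandwidth_hz * tau_s) * rms_bandwidth_hz)


def angle_error_ratio(nu_hz, bandwidth_hz, tau_s):
    """Eq. (12.28): ratio of delay-derived to fringe-frequency-derived angle error."""
    return 2.0 * math.pi * (tau_s / T_E) / (bandwidth_hz / nu_hz)


# Hypothetical observation: 8.4 GHz, 2-MHz band, 100-s integration, T_S/T_A = 50.
nu, bw, tau, ratio_t = 8.4e9, 2.0e6, 100.0, 50.0
s_f = sigma_fringe_frequency(ratio_t, bw, tau)
s_tau = sigma_delay(ratio_t, bw, tau, bw / math.sqrt(12.0))

# From Eqs. (12.26) and (12.27) without the sin(theta) factors:
# angle error from delay = c*s_tau/D, from fringe frequency = s_f/(D_lambda*omega_e),
# so their ratio is s_tau * nu * omega_e / s_f, independent of D.
direct_ratio = s_tau * nu * OMEGA_E / s_f
formula_ratio = angle_error_ratio(nu, bw, tau)
```

The two ratios agree, and the temperature ratio T _S∕T _A cancels, so Eq. (12.28) depends only on the integration time and the fractional bandwidth.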

The accuracy with which group delay can be used to measure a source position is proportional to the reciprocal of the bandwidth 1∕Δ ν. Similarly, the accuracy with which phase can be used to measure a source position is proportional to the reciprocal of the observing frequency 1∕ν. Since the proportionality constants are approximately the same, the relative accuracy of these techniques is ν∕Δ ν. This ratio of the observing frequency to the bandwidth, including effects of bandwidth synthesis, is commonly one to two orders of magnitude. On the other hand, the antenna spacings used in VLBI are one to two orders of magnitude greater than those used in connected-element systems. Thus, the accuracy of source positions estimated from group delay measurements with VLBI systems is comparable to the accuracy of those estimated from fringe phase measurements on connected-element systems having much shorter baselines. VLBI position measurements using phase referencing, as described below, are the most accurate of radio methods.

The ultimate limitations on ground-based interferometry are imposed by the atmosphere. Dual-frequency-band measurements effectively remove ionospheric phase noise (see Sect. 14.1.3). The rms phase noise of the troposphere increases about as $d^{5/6}$, where d is the projected baseline length, for baselines shorter than a few kilometers [see Eq. (13.101) and Table 13.3]. In this regime, measurement accuracies of angles improve only slowly with increasing baseline length. For baselines greater than ∼ 100 km, the effects of the troposphere above the interferometer elements are uncorrelated, and the measurement accuracy might be expected to improve more rapidly with baseline length. However, for widely spaced elements, the zenith angle can be significantly different, and the atmospheric model becomes very important.
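The slow improvement on short baselines can be made concrete with a small sketch. It assumes that the angular error scales as the tropospheric phase noise divided by the baseline length, so that an rms phase growing as the 5∕6 power of d gives an angular error falling only as the −1∕6 power; this scaling argument is an inference from the statement above, and the function name is hypothetical.

```python
def relative_angular_error(d, d_ref, exponent=5.0 / 6.0):
    """Angular error at baseline d relative to that at d_ref.

    Assumes angular error ~ (phase noise) / d with phase noise ~ d**exponent.
    """
    return (d / d_ref) ** (exponent - 1.0)


doubling = relative_angular_error(2.0, 1.0)    # gain from doubling the baseline
tenfold = relative_angular_error(10.0, 1.0)    # gain from a tenfold increase
```

Under this assumption, doubling the baseline reduces the angular error by only about 11%, and a tenfold increase reduces it by about a third.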

12.2.3 Phase Referencing (Position)

In VLBI measurements of relative positions of closely spaced sources, it is possible to measure the relative fringe phases and thus obtain positional accuracy corresponding to the very high angular resolution inherent in the long baselines. The most accurate measurements can be made when the sources are sufficiently close that both fall within the antenna beams [see, e.g., Marcaide and Shapiro (1983) and Rioja et al. (1997)] or when they are no more than a few degrees apart so that tropospheric and ionospheric effects are closely matched (Shapiro et al. 1979; Bartel et al. 1984; Ros et al. 1999). In such cases, one source can be used as a calibrator in a manner similar to that for phase calibration in connected-element arrays. In VLBI, this procedure is referred to as phase referencing. It allows imaging of sources for which the flux densities are too low to permit satisfactory self-calibration. The description here follows reviews of phase referencing procedures by Alef (1989) and Beasley and Conway (1995).

In phase referencing observations, measurements are made alternately on the target source and on a nearby calibrator, with periods on the order of a minute on each. (Note that the calibrator is also referred to as the phase reference source.) The rate of change of phase during these measurements must be slow enough that, from one calibrator measurement to the next, it is possible to interpolate the phase without ambiguity factors of 2π. It is therefore necessary to use careful modeling to remove geodetic and atmospheric effects, including tectonic plate motions, polar motion, Earth tides, and ocean loading, and to make precise corrections for precession and nutation on the source positions. More subtle effects may need to be taken into account; for example, gravitational distortions of antenna structures, which tend to cancel out in connected-element arrays, can affect VLBI baselines because of the difference in elevation angles at widely spaced locations. Phase referencing has become more useful as better models for these effects, together with increased sensitivity and phase stability of receiving systems, have been developed.

Consider the case in which we observe the calibration source at time t ₁, then the target source at time t ₂, and then the calibrator again at time t ₃. For any one of these observations, the measured phase is

$$\displaystyle{ \phi _{\mathrm{meas}} =\phi _{\mathrm{vis}} +\phi _{\mathrm{inst}} +\phi _{\mathrm{pos}} +\phi _{\mathrm{ant}} +\phi _{\mathrm{atmos}} +\phi _{\mathrm{ionos}}\;, }$$

(12.29)

where the terms on the right side are, respectively, the components of the phase due to the source visibility, instrumental effects (cables, clock errors, etc.), the error in the assumed source position, errors in assumed antenna positions, the effect of the neutral atmosphere, and the effect of the ionosphere. To correct the phase of the target source, we need to interpolate the measurements on the calibrator at t ₁ and t ₃ to estimate what the calibrator phase would have been if measured at t ₂, and then subtract the interpolated phase from the measured phase for the target source. If the positions of the target source and the calibrator are sufficiently close on the sky (not more than a few degrees apart), lines of sight from any antenna to the two sources will pass through the same isoplanatic patch, so the differences in the atmospheric and ionospheric terms can be neglected. We can assume that the instrumental terms do not differ significantly with small position changes, and if the calibrator is unresolved, then its visibility phase is zero. If the calibrator is partially resolved, it should be strong enough to allow imaging by self-calibration, and correction can be made for its phase. Thus, the corrected phase of the target source reduces to

$$\displaystyle{ \phi ^{t} -\tilde{\phi }^{c} =\phi _{ \mathrm{ vis}}^{t} + \left (\phi _{\mathrm{ pos}}^{t} -\tilde{\phi }_{\mathrm{ pos}}^{c}\right )\;, }$$

(12.30)

where the superscripts t and c refer to the target source and calibrator, respectively, and the tilde indicates interpolated values. The right side of Eq. (12.30) depends only on the structure and position of the target source, and the position of the calibrator. Figure 12.1 shows an example of phase referencing in which fringe fitting was performed on the data for the reference source, that is, determination of baseline errors, offsets between time standards at the sites, and instrumental phases. The results for the phase reference source (calibrator) are shown as crosses, and the resulting phase and phase rate corrections were interpolated to the times of the data points for the target source, shown as open squares. The corrected phases for the target source are shown in the lower diagram. For fringe fitting, it is desirable to have a source that is unresolved and provides a strong signal, so a phase reference source should be chosen for these characteristics when the target source is weak or resolved.
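The interpolation and subtraction that lead to Eq. (12.30) can be sketched numerically. The following is a minimal illustration, not the procedure of any particular reduction package: it assumes linear interpolation between two calibrator scans and assumes that the true calibrator phase changes by less than π between them. The function names and all numerical values are hypothetical.

```python
import math

def wrap(phase):
    """Wrap a phase in radians into the interval (-pi, pi]."""
    return math.atan2(math.sin(phase), math.cos(phase))

def reference_phase(t1, phi_c1, t2, phi_t2, t3, phi_c3):
    """Return the phase-referenced target phase at time t2.

    Assumption: the calibrator phase changes by less than pi between
    t1 and t3, so the wrapped difference gives the correct slope.
    """
    # Shortest-arc change of the calibrator phase between the two scans.
    dphi = wrap(phi_c3 - phi_c1)
    # Linear interpolation of the calibrator phase to the target epoch.
    phi_c_interp = phi_c1 + dphi * (t2 - t1) / (t3 - t1)
    # Subtract and wrap: the left side of Eq. (12.30).
    return wrap(phi_t2 - phi_c_interp)

# Hypothetical data: a common drift of 0.02 rad/s on both sources plus
# a 0.3 rad offset on the target (structure and position terms).
drift, offset = 0.02, 0.3
t1, t2, t3 = 0.0, 60.0, 120.0
phi_c1 = wrap(1.0 + drift * t1)
phi_c3 = wrap(1.0 + drift * t3)
phi_t2 = wrap(1.0 + drift * t2 + offset)

corrected = reference_phase(t1, phi_c1, t2, phi_t2, t3, phi_c3)
print(f"corrected target phase = {corrected:.3f} rad")
```

In this toy case the common drift cancels and only the 0.3 rad target term survives. A drift fast enough to exceed π between calibrator scans would break the shortest-arc assumption, which is the ambiguity problem discussed in the text.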

Of the various effects in Eq. (12.29) that are removed by phase referencing, those that vary most rapidly with time are the atmospheric ones, and at frequencies above a few gigahertz, they result primarily from the troposphere rather than the ionosphere. Thus, at centimeter wavelengths, the tropospheric variations limit the time that can be allowed for each cycle of observation of the target and calibrator sources. Variations resulting from a moving-screen model of the troposphere are described in Section 13.1.6; the characteristics of the screen are based on Kolmogorov turbulence theory (Tatarski 1961). The relative rms variation in phase for the target and calibrator sources, the rays from which pass through the atmosphere a distance d _tc apart, is proportional to d _tc^{5∕6}:

$$\displaystyle{ \sigma =\sigma _{0}\,d_{tc}^{5/6}\;, }$$

(12.31)

where σ ₀ is the rms phase variation (in radians) for 1-km ray spacing, and d _tc is in kilometers. In order to be able to interpolate the VLBI phase reference values from one calibrator observation to the next without ambiguity in the number of turns, the rms path length should not change by more than ∼ λ∕8, that is, the rms phase should not change by more than ∼ π∕4, between successive calibrator scans. Then if the scattering screen moves horizontally with velocity v _s, the criterion above results in a limit on the time for one cycle of the target source and calibrator, t _cyc. To determine this limit, we put d _tc = v _s t _cyc and σ < π∕4, and from Eq. (12.31) obtain

$$\displaystyle{ t_{\mathrm{cyc}} < \left ( \frac{\pi } {4\sigma _{0}}\right )^{6/5}v_{ s}^{-1}\;. }$$

(12.32)

This result can be used to illustrate the time limit on the switching cycle. The empirical data in Table 13.4 show that at λ = 6 cm (5 GHz frequency), the typical rms delay path is about 1 mm for d _tc = 1 km, at the VLA site. The corresponding value of σ ₀ for 6-cm wavelength is 6°, and for v _s = 0.01 km s⁻¹, t _cyc < 19 min. This result is for typical conditions at the VLA site. For the same location and 1-km ray spacing, but under conditions described as “very turbulent,” Sramek (1990) gives a value of 7.5 mm for the rms path deviation. The value of σ ₀ for 6-cm wavelength is then 45°, resulting in t _cyc < 1.7 min. The elevation angle of the source was not less than 60° for this last observation, so even shorter switching times could apply at lower elevation angles. Specific recommendations for cycle times in VLBI applications are given by Ulvestad (1999).
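The two cycle-time limits quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that takes the λ∕8 path criterion as an rms phase limit of π∕4 and uses the d^{5∕6} law with d = v_s t_cyc; the function name and its argument names are illustrative choices, not notation from the text.

```python
import math

def max_cycle_time_s(rms_path_mm, wavelength_mm, v_s_km_s,
                     max_rms_phase=math.pi / 4):
    """Upper limit on the switching cycle time in seconds.

    sigma_0 is the rms phase (rad) at 1-km ray spacing; the rms phase
    is assumed to grow as sigma_0 * d**(5/6) with d in kilometers.
    """
    sigma_0 = 2 * math.pi * rms_path_mm / wavelength_mm
    d_max_km = (max_rms_phase / sigma_0) ** (6 / 5)
    return d_max_km / v_s_km_s

# Typical VLA conditions: 1 mm rms path at 1 km, 6-cm wavelength.
t_typical = max_cycle_time_s(1.0, 60.0, 0.01)
# "Very turbulent" conditions: 7.5 mm rms path at 1 km.
t_turbulent = max_cycle_time_s(7.5, 60.0, 0.01)

print(f"typical:        t_cyc < {t_typical / 60:.1f} min")
print(f"very turbulent: t_cyc < {t_turbulent / 60:.1f} min")
```

With these inputs the limits come out near 19 min and 1.7 min, in agreement with the values in the text.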

At frequencies below ∼ 1 GHz, the ionosphere becomes the limiting factor and medium-scale traveling ionospheric disturbances (MSTIDs), which have velocities of 100–300 m s⁻¹ and wavelengths up to several hundred kilometers, become important (Hocke and Schlegel 1996). Phase fluctuations resulting from the ionosphere or troposphere are minimized in the approximate range 5–15 GHz, in which good performance can be obtained by phase referencing in VLBI.

There are also limits on the angular range that should be used in switching to the phase reference source, since even with a static atmosphere, phase errors are introduced that increase with switching angle. Phase referencing over 3° with 50-μas precision has been demonstrated by Reid et al. (2009) and Reid and Honma (2014).

The offsets and uncertainties in the geometric parameters of the interferometer cause residual errors that scale in proportion to the angular separation between the target source and the reference or calibrator source. To first order, an offset in the position of the calibrator source is simply transferred to the estimate of the position of the target source. This is because the (u, v) coordinates of the target and calibrator sources can be considered to be the same for a separation of a few degrees or less. However, second-order corrections can be important. The total fringe phase [see Eq. (12.1)] is 2π D _λcosθ. For a target source in the direction θ _t and a calibrator source in direction θ _c, the difference in these interferometer phases will be

$$\displaystyle{ \varDelta \phi =\phi _{t} -\phi _{c} = 2\pi D_{\lambda }(\cos \theta _{t} -\cos \theta _{c})\;. }$$

(12.33)

Since cos(θ _c) ≃ cosθ _t − sinθ _t(θ _c −θ _t),

$$\displaystyle{ \varDelta \phi \simeq 2\pi D_{\lambda }\sin \theta _{t}\,\theta _{\mathrm{sep}}\;, }$$

(12.34)

where θ _sep = θ _c −θ _t.

Now we include the effect of a change in phase for an error in the baseline of Δ D _λ, giving a second-order phase term of

$$\displaystyle{ \varDelta ^{2}\phi \simeq 2\pi \varDelta D_{\lambda }\sin \theta _{ t}\,\theta _{\mathrm{sep}}\;, }$$

(12.35)

or, ignoring the geometric factor,

$$\displaystyle{ \varDelta ^{2}\phi \simeq 2\pi \varDelta D_{\lambda }\theta _{\mathrm{ sep}}\;. }$$

(12.36)

This is the phase that affects astrometric accuracy. Hence, the effect of the baseline error on the phase is reduced by the factor θ _sep. The phase shift has an equivalent position offset of

$$\displaystyle{ \varDelta \theta \simeq \frac{\varDelta ^{2}\phi } {2\pi D_{\lambda }} \simeq \frac{\varDelta D} {D}\,\theta _{\mathrm{sep}}\;. }$$

(12.37)

With D = 8000 km, λ = 4 cm, and θ _sep = 1 degree (0.017 radians), D _λ = 2 × 10⁸, and the resolution is θ _s = 1∕D _λ ≈ 1 mas. An error in the baseline of 2 cm would cause a phase error of about 3 degrees, corresponding to an angle of 9 μas.
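The numbers in this example follow directly from Eqs. (12.36) and (12.37), as the short calculation below shows. It is only an arithmetic check; the variable names are illustrative.

```python
import math

MAS_PER_RAD = math.degrees(1.0) * 3600e3  # milliarcseconds per radian

D_m = 8.0e6                    # baseline length, 8000 km
wavelength_m = 0.04            # 4 cm
theta_sep = math.radians(1.0)  # target-calibrator separation
dD_m = 0.02                    # baseline error, 2 cm

# Baseline in wavelengths and the corresponding resolution 1/D_lambda.
D_lambda = D_m / wavelength_m
resolution_mas = MAS_PER_RAD / D_lambda

# Eq. (12.36): second-order phase error from the baseline error.
phase_error_deg = math.degrees(2 * math.pi * (dD_m / wavelength_m) * theta_sep)

# Eq. (12.37): equivalent position offset, converted to microarcseconds.
offset_uas = (dD_m / D_m) * theta_sep * MAS_PER_RAD * 1e3

print(f"D_lambda    = {D_lambda:.1e}")
print(f"resolution  = {resolution_mas:.2f} mas")
print(f"phase error = {phase_error_deg:.1f} deg")
print(f"offset      = {offset_uas:.1f} microarcsec")
```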

Equation (12.37) provides an excellent rule of thumb for astrometric accuracy, and Δ D∕D can be interpreted as a rotational angular error in the baseline, or Δ D can be replaced by c Δ τ, where Δ τ might characterize the atmospheric delay error.

Similarly, if there is an error in the calibrator position of Δ θ _c, then, from Eq. (12.34), there will be an error in the phase of

$$\displaystyle{ \varDelta ^{2}\phi \simeq 2\pi D_{\lambda }\cos \theta _{ t}\,\varDelta \theta _{c}\,\theta _{\mathrm{sep}}\;, }$$

(12.38)

where we assume sinθ _t ≃ sinθ _c. Again, ignoring the trigonometric factor, we can write

$$\displaystyle{ \varDelta ^{2}\phi \simeq 2\pi D_{\lambda }\varDelta \theta _{ c}\,\theta _{\mathrm{sep}} \simeq 2\pi \frac{\varDelta \theta _{c}\,\theta _{\mathrm{sep}}} {\theta _{\mathrm{res}}} \;. }$$

(12.39)

If astrometry is done in the image plane with an array, the phase errors on the various baselines have magnitudes given by Eq. (12.38) but differ in their trigonometric factors, so they can be thought of as an rms phase noise. The image will be substantially degraded when Δ ² ϕ ∼ 1. To meet this criterion with D = 8000 km, λ = 4 cm, and θ _sep = 1 degree, the calibrator position must be known to within about 10 mas. With an rms phase error of 1 radian, the visibility of a point source is reduced by a factor of exp[−(Δ ² ϕ)²∕2] ∼ 0.6. For 2 radians of phase error, the image would be destroyed. The equivalent angular offset for the phase shift given in Eq. (12.39) is

$$\displaystyle{ \varDelta \theta \simeq \varDelta \theta _{c}\,\theta _{\mathrm{sep}}\;. }$$

(12.40)

Note that Δ θ _c plays the same role as Δ D∕D in Eq. (12.37). Hence, for θ _sep = 1 degree, an astrometric accuracy of 10 μas requires a position error in the calibrator of about 0.6 mas or less. An analysis in the (u, v) plane is given in Appendix 12.2.
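The calibrator-position requirements can be evaluated from Eqs. (12.39) and (12.40) for any separation. The sketch below uses the same example parameters as above (D = 8000 km, λ = 4 cm, θ_sep = 1 degree) and the Gaussian coherence factor exp(−σ²∕2) for an rms phase error σ. The function names are illustrative.

```python
import math

MAS_PER_RAD = math.degrees(1.0) * 3600e3  # milliarcseconds per radian

def calibrator_tolerance_mas(D_lambda, theta_sep_rad, max_phase_rad=1.0):
    """Calibrator position error giving a phase error max_phase_rad, Eq. (12.39)."""
    return max_phase_rad / (2 * math.pi * D_lambda * theta_sep_rad) * MAS_PER_RAD

def coherence(rms_phase_rad):
    """Reduction of point-source visibility for Gaussian rms phase noise."""
    return math.exp(-rms_phase_rad ** 2 / 2)

def calibrator_error_for_accuracy(accuracy, theta_sep_rad):
    """Eq. (12.40) inverted: allowed calibrator error, in the units of accuracy."""
    return accuracy / theta_sep_rad

D_lambda = 8.0e6 / 0.04
theta_sep = math.radians(1.0)

tol_mas = calibrator_tolerance_mas(D_lambda, theta_sep)
allowed_uas = calibrator_error_for_accuracy(10.0, theta_sep)

print(f"calibrator tolerance for 1 rad phase error: {tol_mas:.1f} mas")
print(f"coherence at 1 rad: {coherence(1.0):.2f}, at 2 rad: {coherence(2.0):.2f}")
print(f"calibrator error for 10 microarcsec at 1 deg: {allowed_uas:.0f} microarcsec")
```

Because Eq. (12.40) is linear in θ_sep, the allowed calibrator error scales inversely with separation: a 3 degree separation tightens the requirement by a factor of three.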

12.2.4 Phase Referencing (Frequency)

At millimeter wavelengths, position phase referencing becomes more difficult because calibration sources are generally weaker and more sparsely distributed than at longer wavelengths. Coherence times are also shorter, e.g., a few tens of seconds above 100 GHz, requiring rapid antenna pointing changes for calibration by position switching. In this case, frequency switching on the target source itself is a valuable calibration technique. The goal is to remove effects in the fringe phase that scale as frequency, i.e., that can be characterized by a nondispersive or constant delay. The phase at the lower frequency (ν _c), ϕ _c, is used to calibrate the phase at the higher frequency (ν _t), ϕ _t, by forming the quantity

$$\displaystyle{ \phi =\phi _{t} - R\phi _{c}\;, }$$

(12.41)

where R is the ratio of frequencies ν _t∕ν _c. This procedure will remove the effects of the atmosphere and the frequency standards but not the ionosphere or other dispersive processes. Note that at low frequencies, the goal of dual frequency calibration is to remove the ionospheric delay, and R in Eq. (12.41) is replaced by 1∕R (e.g., see Sects. 12.6 and 14.1.3). It is usually convenient to choose R to be an exact integer to avoid the need to deal with phase wrap issues. To see this, focus on the term that describes the tropospheric excess path length L. For this exercise, ϕ _c = 2π ν _c L∕c + 2π n _c and ϕ _t = 2π ν _t L∕c + 2π n _t, where n _c and n _t are the integers that characterize the phase wraps. The calibrated phase is thus

$$\displaystyle{ \phi = 2\pi \,(n_{t} - Rn_{c})\;, }$$

(12.42)

which will be an integral multiple of 2π if R is an integer. An early demonstration of this technique was carried out by Middelberg et al. (2005), who used phases at 14.375 GHz to calibrate those at 86.25 GHz (R = 6). The residual phase errors caused by the ionosphere and by electronic drifts in local oscillator chains have a much longer timescale than tropospheric variations. These can be removed by adding a slower position switching cycle. The efficacy of this double switching technique has been demonstrated by Rioja and Dodson (2011), Rioja et al. (2014, 2015), and Jung et al. (2011). If the source structure consists of a compact core at both frequencies, as in many AGN, the shift in position with frequency caused by opacity effects in the source can be an important physical diagnostic. This shift can be accurately measured by application of frequency/position switching calibration.
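The role of an integer frequency ratio can be illustrated with a toy calculation in which the only contribution to the phase is a nondispersive excess path L. The sketch below wraps the phases at both frequencies, forms ϕ_t − Rϕ_c as in Eq. (12.41), and compares an integer ratio with a non-integer one. The 5-cm path and the 80-GHz comparison frequency are hypothetical values chosen for illustration.

```python
import math

C_M_S = 299_792_458.0  # speed of light in m/s

def wrapped_phase(freq_hz, path_m):
    """Measured phase in [0, 2*pi) for an excess path at one frequency."""
    return (2 * math.pi * freq_hz * path_m / C_M_S) % (2 * math.pi)

def calibrated_phase(freq_t_hz, freq_c_hz, path_m):
    """phi_t - R * phi_c from wrapped phases, wrapped into (-pi, pi]."""
    R = freq_t_hz / freq_c_hz
    phi = wrapped_phase(freq_t_hz, path_m) - R * wrapped_phase(freq_c_hz, path_m)
    return math.atan2(math.sin(phi), math.cos(phi))

path_m = 0.05  # hypothetical tropospheric excess path

# Integer ratio R = 6 (14.375 GHz -> 86.25 GHz): the wraps cancel.
integer_case = calibrated_phase(86.25e9, 14.375e9, path_m)
# Non-integer ratio (14.375 GHz -> 80 GHz): a wrap-dependent residual remains.
noninteger_case = calibrated_phase(80.0e9, 14.375e9, path_m)

print(f"R = 6:       residual = {integer_case:.2e} rad")
print(f"R = 80/14.4: residual = {noninteger_case:.3f} rad")
```

For the non-integer ratio the residual depends on the unknown number of turns at the lower frequency, so the calibration is only usable if the lower-frequency phase can be unwrapped reliably.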

12.3 Time and Motion of the Earth

We now consider the effect of changes in the magnitude and direction of the Earth’s rotation vector on interferometric measurements. These changes cause variations in the apparent celestial coordinates of sources, the baseline vectors of the antennas, and universal time. The variations of the Earth’s rotation can be divided into three categories:

1. There are variations in the direction of the rotation axis, resulting mainly from precession and nutation of the spinning body. Since the direction of the axis defines the location of the pole of the celestial coordinate system, the result is a variation in the right ascension and declination of celestial objects.
2. The axis of rotation varies slightly with respect to the Earth’s surface; that is, the positions on the Earth at which this axis intersects the Earth’s surface vary. This effect is known as polar motion. Since the (X, Y, Z) coordinate system of baseline specification introduced in Sect. 4.1 takes the direction of the Earth’s axis as the Z axis, polar motion results in a variation of the measured baseline vectors (but not of the baseline length). It also results in a variation in universal time.
3. The rate of rotation varies as a result of atmospheric and crustal effects, and this again results in variation in universal time.

We briefly discuss these effects. Detailed discussions from a geophysical viewpoint can be found in Lambeck (1980).

12.3.1 Precession and Nutation

The gravitational effects of the Sun, Moon, and planets on the nonspherical Earth produce a variety of perturbations in its orbital and rotational motions. To take account of these effects, it is necessary to know the resulting variation of the ecliptic, which is defined by the plane of the Earth’s orbit, as well as the variation of the celestial equator, which is defined by the rotational motion of the Earth. The gravitational effects of the Sun and Moon on the equatorial bulge (quadrupole moment) of the Earth result in a precessional motion of the Earth’s axis around the pole of the ecliptic.

The Earth’s rotation vector is inclined at about 23.5° to the pole of its orbital plane, the ecliptic. The period of the resulting precession is approximately 26,000 years, corresponding to a motion of the rotation vector of 20″ per year [2π sin(23.5°)∕26,000 radians per year]. The 23.5° obliquity is not constant but is currently decreasing at a rate of 47″ per century, due to the effect of the planets, which also cause a further component of precession. The lunisolar and planetary precessional effects, together with a smaller relativistic precession, are known as the general precession. Precession results in the motion of the line of intersection of the ecliptic and celestial equator. This line, called the line of nodes, defines the equinoxes and the zero of right ascension, which precess at a rate of 50″ per year. In addition, the time-varying lunisolar gravitation effects cause nutation of the Earth’s axis with periods of up to 18.6 years and a total amplitude of about 9″. The principal variations of the ecliptic and equator are those just described, but other smaller effects also occur. The general accuracy within which positional variations can be calculated is better than 1 mas (Herring et al. 1985). Expressions for precession can be found in Lieske et al. (1977) and for nutation in Wahr (1981). The required procedures are discussed in texts on spherical astronomy, such as Woolard and Clemence (1966), Taff (1981), and Seidelmann (1992).
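The two precession rates quoted above follow from the 26,000-year period and the 23.5° obliquity. The short check below uses only those rounded values, so it reproduces the rates only to the precision given in the text.

```python
import math

ARCSEC_PER_RAD = math.degrees(1.0) * 3600

period_yr = 26_000.0
obliquity = math.radians(23.5)

# Motion of the rotation vector on the sky: it moves around a small
# circle of angular radius equal to the obliquity once per period.
pole_rate_arcsec = 2 * math.pi * math.sin(obliquity) / period_yr * ARCSEC_PER_RAD

# Motion of the equinox along the ecliptic: a full 360 degrees per period.
equinox_rate_arcsec = 360.0 * 3600.0 / period_yr

print(f"rotation vector: {pole_rate_arcsec:.1f} arcsec/yr")
print(f"equinox:         {equinox_rate_arcsec:.1f} arcsec/yr")
```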

Since precession and nutation result in variations in celestial coordinates that can be as large as 50″ per year for objects at low declinations, these effects must be taken into account in almost all observational work, whether astrometric or not. Positions of objects in astronomical catalogs are therefore reduced to the coordinates of standard epochs, B1900.0, B1950.0, or J2000.0. These dates denote the beginning of a Besselian year or Julian year, as indicated by the B or J. The positions correspond to the mean equator and equinox for the specified epoch, where “mean” indicates the positions of the equator and equinox resulting from the general precession, but not including nutation. For further explanation and a discussion of a method of conversion between standard epochs, see Seidelmann (1992). Correction is also required for aberration, that is, for the apparent shift in position resulting from the finite velocity of light and the motion of the observer. Two components are involved: annual aberration resulting from the Earth’s orbital motion, which has a maximum value of about 20″; and diurnal aberration resulting from the rotational motion, which has a maximum value of 0.3″. The retarded baseline concept (Sect. 9.3) used in VLBI data reduction accounts for the diurnal aberration. For the nearer stars, corrections for proper motion (i.e., actual motion of the star through space) are required and in some cases also for the parallax resulting from the changing position of the Earth in its orbit (see Sect. 12.5). The impact of radio techniques, particularly VLBI, is resulting in refinement of the classical expressions and parameters. Effects such as the deflection of electromagnetic waves in the Sun’s gravitational field must also be included in positional work of the highest accuracy (see Sect. 12.6).
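The maximum aberration angles quoted above are, to first order, the ratio v∕c of the observer's speed to the speed of light. The check below assumes a mean orbital speed of 29.8 km s⁻¹ and an equatorial rotation speed of 0.465 km s⁻¹; these two speeds are standard values supplied here for illustration and are not taken from the text.

```python
import math

ARCSEC_PER_RAD = math.degrees(1.0) * 3600
C_KM_S = 299_792.458

# Assumed speeds (not from the text): Earth's mean orbital speed and
# the rotation speed of a point on the equator.
V_ORBIT_KM_S = 29.8
V_ROTATION_KM_S = 0.465

def max_aberration_arcsec(speed_km_s):
    """First-order maximum aberration angle v/c, in arcseconds."""
    return speed_km_s / C_KM_S * ARCSEC_PER_RAD

annual = max_aberration_arcsec(V_ORBIT_KM_S)
diurnal = max_aberration_arcsec(V_ROTATION_KM_S)

print(f"annual aberration:  {annual:.1f} arcsec")
print(f"diurnal aberration: {diurnal:.2f} arcsec")
```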

12.3.2 Polar Motion

The term polar motion denotes the variation of the pole of rotation of the Earth (the geographic pole) with respect to the Earth’s crust. This results in a component of motion of the celestial pole that is distinct from precessional and other motions. Polar motion is largely, but not totally, of geophysical origin. The motion of the geographic pole around the pole of the Earth’s figure is irregular, but over the last century, the distance between these two poles wandered by up to 0.5″, or 15 m on the Earth’s surface. In a year’s time, the excursion of the figure axis is typically 6 m or less. The motion can be analyzed into several components, some regular and some highly irregular, and not all are understood. The two major components have periods of 12 and 14 months. The 12-month component is a forced motion due to the annual redistribution of water and of atmospheric angular momentum and is far from any resonance. The 14-month component, known as the Chandler wobble (Chandler 1891), is the motion at a resonance frequency whose driving force is unknown. For a more detailed description, see Wahr (1996).

The motion of the pole of rotation is measured in angle or distance in the x and y directions, as shown in Fig. 12.2. The (x, y) origin is the mean pole of 1900–1905, which is referred to as the conventional international origin (CIO), and the x axis is in the plane of the Greenwich meridian (Markowitz and Guinot 1968). Since polar motion is a small angular effect, it can often be ignored in imaging observations, especially if the visibility is measured with respect to a calibrator that is only a few degrees from the center of the field being imaged.

12.3.3 Universal Time

Like the motion of the Earth, the system of timekeeping based on Earth rotation is a complicated subject, and for a detailed discussion, one can refer to Smith (1972) or to the texts mentioned in the discussion of precession and nutation above. We shall briefly review some essentials. Solar time is defined in terms of the rotation of the Earth with respect to the Sun. In practice, the stars present more convenient objects for measurement, so solar time is derived from measurement of the sidereal rotation. The positions of stars or radio sources used for such measurements are adjusted for precession, nutation, and so on, and the resulting time measurements thus depend only on the angular velocity of the Earth and on polar motion. When converted to the solar timescale, these measurements provide a form of universal time (UT) known as UT0; this is not truly “universal” since the effects of polar motion, which can amount to about 35 ms, depend on the location of the observatory. When UT0 is corrected for polar motion, the result is known as UT1. Since it is a measure of the rotation of the Earth relative to fixed celestial objects, UT1 is the form of time required in astronomical observing, including the analysis of interferometric observations, navigation, and surveying. However, UT1 contains the effects of small variations in the Earth’s rotation rate, attributable largely to geophysical effects such as the seasonal variations in the distribution of water between the surface and atmosphere. Fluctuations in the length of day over the period of a year are typically about 1 ms. To provide a more uniform measure of time, UT2 is derived from UT1 by attempting to remove seasonal variations. UT2 is rarely used. UT1 and UT2 include the effect of the gradual decrease of the rotation rate of the Earth. This causes the length of the UT1/UT2 day to increase slightly when compared with International Atomic Time (IAT), which is based on the frequency of the cesium line (see Sect. 9.5.4). The IAT second is the basis for another form of UT, Coordinated Universal Time (UTC), which is offset from IAT so that | UT1 − UTC | < 1 s. This relationship is maintained by inserting one-second discontinuities (leap seconds) in UTC when required on specified days of the year.

The practice at many observatories is to maintain UTC or IAT using an atomic standard and then obtain UT1 from the published values of ΔUT1 = UT1 – UTC. Since ΔUT1 is measured rather than computed, in principle it can be determined only after the fact. However, it is possible to predict it by extrapolation with satisfactory accuracy for periods of one or two weeks and thus to implement UT1 in real time. Values of ΔUT1 are available from the Bureau International de l’Heure (BIH), which was established in 1912 at the Paris Observatory to coordinate international timekeeping, and from the U.S. Naval Observatory. Rapid service data are available from these institutions with a timeliness suitable for extrapolation.

12.3.4 Measurement of Polar Motion and UT1

The classical optical methods of measuring polar motion and UT1 are by timing the meridian transits of stars of known positions. Observations at different longitudes, using stars at more than one declination, are required to determine all three parameters (x, y, ΔUT1). During the 1970s, it became evident that such astrometric tasks can also be performed by radio interferometry (McCarthy and Pilkington 1979).

To specify the baseline components of an interferometer for such measurements, we use the (X, Y, Z) system of Sect. 4.1, rotated so that the X axis lies in the Greenwich meridian instead of the local meridian. Let Δ X, Δ Y, and Δ Z be the changes in the baseline components resulting from polar motion (x, y) (in radians) and a time variation (UT1 – UTC) corresponding to Θ radians. Then we may write

$$\displaystyle{ \left [\begin{array}{*{10}c} \varDelta X\\ \varDelta Y \\ \varDelta Z\\ \end{array} \right ] = \left [\begin{array}{*{10}c} 0 &-\varTheta & x\\ \varTheta & 0 &-y \\ -x& y & 0\\ \end{array} \right ]\left [\begin{array}{*{10}c} X\\ Y \\ Z\\ \end{array} \right ]\;, }$$

(12.43)

where the square matrix is a three-dimensional rotational matrix valid for small angles of rotation. Θ, x, and y are the rotation angles about the Z, Y, and X axes, respectively. From Eq. (12.43), we obtain

$$\displaystyle\begin{array}{rcl} \varDelta X& =& -\varTheta Y + xZ\;, \\ \varDelta Y & =& \ \varTheta X - yZ\;, \\ \varDelta Z& =& -xX + yY \;.{}\end{array}$$

(12.44)

Thus, if one observes a series of sources at periodic intervals and determines the variation in baseline parameters, Eqs. (12.44) can be used to determine UT1 and polar motion. For an interferometer with an east–west baseline (Z = 0), one can determine Θ but cannot separate the effects of x and y. An east–west interferometer located on the Greenwich meridian (X = Z = 0) would yield measures of Θ and y but not of x. If it had a north–south component of baseline (Z ≠ 0), one could still measure y but would not be able to separate the effects of x and Θ. In general, one cannot measure all three quantities with a single baseline, since a single direction is specified by two parameters only. Systems suitable for a complete solution might be, for example, two east–west interferometers separated by about 90^∘ in longitude or a three-element noncollinear interferometer. An example of VLBI measurements of the pole position is shown in Fig. 12.3. The Global Positioning System provides a method of making pole-position measurements [see, e.g., Herring (1999)].
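The degeneracies described above can be verified directly from Eqs. (12.44). In the sketch below, the first check shows that for an east–west baseline on the Greenwich meridian (X = Z = 0) the polar motion component x has no effect. The second check illustrates why a single baseline cannot yield all three quantities: Eqs. (12.44) are equivalent to a cross product of the vector (y, x, Θ) with the baseline, so adding any multiple of (X, Y, Z) to (y, x, Θ) leaves the baseline changes unaltered. The baseline components and angles are hypothetical.

```python
def baseline_change(baseline, theta, x, y):
    """Eqs. (12.44): (dX, dY, dZ) for small rotations theta, x, y in radians."""
    X, Y, Z = baseline
    return (-theta * Y + x * Z,
            theta * X - y * Z,
            -x * X + y * Y)

# Hypothetical small rotations (radians).
theta, x, y = 2.0e-8, 1.0e-6, -0.5e-6

# East-west baseline on the Greenwich meridian: only Y is nonzero,
# so changing x makes no difference to the result.
ew = (0.0, 5.0e6, 0.0)
same_without_x = baseline_change(ew, theta, x, y) == baseline_change(ew, theta, 0.0, y)

# General baseline (meters): shift (y, x, theta) by k * (X, Y, Z).
general = (3.0e6, -4.0e6, 2.0e6)
X, Y, Z = general
k = 1.0e-13
original = baseline_change(general, theta, x, y)
shifted = baseline_change(general, theta + k * Z, x + k * Y, y + k * X)
max_difference = max(abs(a - b) for a, b in zip(original, shifted))

print("x unobservable on Greenwich east-west baseline:", same_without_x)
print(f"largest change from the degenerate shift: {max_difference:.1e} m")
```

A second baseline with a different direction removes this degeneracy, which is why two east–west interferometers at well-separated longitudes, or a three-element noncollinear array, suffice for a complete solution.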

The methods just described are applicable to observations using connected-element interferometers in which the phase can be calibrated, and also to VLBI observations in which the bandwidth is sufficient to obtain accurate group delay measurements. An example of VLBI determination of the length of day is shown in Fig. 12.4. The data show an annual variation of about 2 ms, which is caused by the angular momentum exchange between the Earth and the atmosphere due to the difference in land mass in the Northern and Southern Hemispheres [see, e.g., Paek and Huang (2012)]. The trend in the long-term variation is thought to be due to an exchange of angular momentum between the Earth’s core and mantle. The effects of El Niño events can be seen in these data (Gipson and Ma 1999). A comparison of determinations of UT1 and polar motion by VLBI, satellite laser ranging, and BIH analyses of standard astrometric data is given in Robertson et al. (1983) and Carter et al. (1984).

VLBI is a unique tool for the study of many phenomena related to Earth dynamics. For example, the period and amplitude of the free-core nutation have been estimated (Krásná et al. 2013).

12.4 Geodetic Measurements

Certain geophysical phenomena, for example, Earth tides (Melchior 1978) and movements of tectonic plates, can result in variations in the baseline vector of a VLBI system. Variations in the length of the baseline are clearly attributable to such phenomena, whereas variations in the direction can also result from polar motion and rotational variations. Magnitudes of the effects are of order 1–10 cm per year for plate motions and 30 cm (diurnal) for Earth tides. They are thus measurable using the techniques of VLBI. Solid-Earth tides were first detected by Shapiro et al. (1974), and refined measurements were reported by Herring et al. (1983). In addition to solid-Earth tides, displacement of land masses resulting from tidal shifts of water masses, called ocean loading, is measurable. The earliest evidence of contemporary motion of tectonic plates was found by Herring et al. (1986), who reported that the increase in the baseline between Westford, MA, and Onsala, Sweden, based on data from 1980 to 1984, was 17 ± 2 mm/yr. A plot of the extensive measurements of the Westford–Onsala baseline is shown in Fig. 12.5. For reviews of geodetic applications of VLBI, see Shapiro (1976), Counselman (1976), Clark et al. (1985), Carter and Robertson (1993), and Sovers et al. (1998).

12.5 Proper Motion and Parallax Measurements

The position of a relatively nearby star or radio source changes with respect to the distant background due to the annual motion of the Earth around the Sun. This effect is called annual parallax. It can be used to measure distances by the classical technique of trigonometric triangulation, first demonstrated by Bessel (1838) from optical observations of the star 61 Cygni. The parallax angle, Π, is defined as one-half of the total excursion in apparent position over a year. The distance to the object, by simple trigonometry for small angles, is

$$\displaystyle{ D = \frac{1} {\varPi } \;. }$$

(12.45)

By definition, an object with a parallactic angle of 1″ has a distance of 1 parsec. Hence, a parsec is 206265 (the number of arcseconds in a radian) times the Sun–Earth distance [called the astronomical unit (AU)], or 3.1 × 10¹⁸ cm. The AU, which is determined by ranging measurements of the planets and spacecraft, is called the first rung on the cosmic distance ladder. Its value is 1.4959787070000 × 10¹³ cm, to an accuracy of about 1 part in 50 billion (Pitjeva and Standish 2009). The intrinsic motion of nearby objects can also be measured. This is called proper motion. The precision of VLBI astrometric measurements has greatly extended the distances over which proper motions and parallaxes can be measured. If the parallax accuracy is σ _Π, the uncertainty in the distance can be determined from the differential of Eq. (12.45), i.e., Δ D = −Π ⁻² Δ Π, to be σ _D = D ² σ _Π. Hence, the fractional distance accuracy is

$$\displaystyle{ \left ( \frac{\sigma _{D}} {D}\right ) = D\sigma _{\varPi }\;. }$$

(12.46)

The important result here is that the fractional distance uncertainty grows linearly with distance for a fixed positional accuracy. Hence, a distance accuracy of 10% can be achieved for objects out to a distance of 10 pc with σ _Π = 0.01″ (ground-based optical), 100 pc with σ _Π = 1 mas (Hipparcos satellite), and 10⁴ pc with σ _Π = 10 μas (VLBI).
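The three distance limits follow from Eq. (12.46) with D in parsecs and σ_Π in arcseconds. A minimal check, with illustrative function names:

```python
def fractional_distance_error(distance_pc, sigma_parallax_arcsec):
    """Eq. (12.46): sigma_D / D for D in parsecs and sigma_Pi in arcseconds."""
    return distance_pc * sigma_parallax_arcsec

def max_distance_pc(sigma_parallax_arcsec, fractional_error=0.1):
    """Largest distance at which the given fractional accuracy is reached."""
    return fractional_error / sigma_parallax_arcsec

cases = {
    "ground-based optical (0.01 arcsec)": 1.0e-2,
    "Hipparcos (1 mas)": 1.0e-3,
    "VLBI (10 microarcsec)": 1.0e-5,
}
for label, sigma in cases.items():
    print(f"{label}: 10% accuracy out to {max_distance_pc(sigma):.0f} pc")
```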

For measurements of parallax with better than 10% accuracy, σ _Π∕Π < 0.1, the distance estimate of $D = \frac{1} {\varPi } \pm \frac{\sigma _{\varPi }} {\varPi ^{2}}$ is essentially unbiased. The situation is more complex when the accuracy is lower. If the probability distribution for a parallax measurement is

$$\displaystyle{ p\,(\varPi ) = \frac{1} {\sqrt{2\pi }\sigma _{\varPi }}\;e^{-\frac{\left (\varPi -\varPi _{0}\right )^{2}} {2\sigma _{\varPi }^{2}} }\;, }$$

(12.47)

where Π ₀ is the true, but unknown, parallax, then the probability distribution function of D, $p\,(D) = p\,(\varPi )\bigg\vert \frac{d\varPi } {dD}\bigg\vert $, is

$$\displaystyle{ p\,(D) = \frac{1} {\sqrt{2\pi }\sigma _{\varPi }} \frac{1} {D^{2}}\;e^{-\frac{\left ( \frac{1} {D}-\varPi _{0}\right )^{2}} {2\sigma _{\varPi }^{2}} }\;. }$$

(12.48)

p (D) becomes increasingly asymmetric with increasing σ _Π∕Π and develops a long tail at large values of D. The expectation of D, i.e., of $\frac{1} {\varPi }$, can be calculated from Eq. (12.47) by Taylor expansion of $D = \frac{1} {\varPi }$ about Π ₀ to second order, which gives the result

$$\displaystyle{ \langle D\rangle \simeq \frac{1} {\varPi _{0}} \left [1 + \frac{\sigma _{\varPi }^{2}} {\varPi _{0}^{2}}\right ]\;. }$$

(12.49)
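The bias in Eq. (12.49) can be checked numerically. The sketch below draws Gaussian parallaxes as in Eq. (12.47) and averages 1∕Π; the parallax value, the sample size, and the seed are arbitrary illustrative choices. The Gaussian model formally allows Π ≤ 0, so such a check is meaningful only when σ _Π∕Π ₀ is small, as assumed here.

```python
import random


def mean_distance_taylor(parallax_true, sigma):
    """Second-order estimate of <D> from Eq. (12.49)."""
    return (1.0 / parallax_true) * (1.0 + (sigma / parallax_true) ** 2)


def mean_distance_monte_carlo(parallax_true, sigma, n_samples=200_000, seed=1):
    """Average of 1/Pi over Gaussian parallax draws, as in Eq. (12.47)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += 1.0 / rng.gauss(parallax_true, sigma)
    return total / n_samples


parallax_true = 1.0  # mas, so distances come out in kpc (illustrative value)
sigma = 0.1          # 10% parallax error
d_taylor = mean_distance_taylor(parallax_true, sigma)
d_mc = mean_distance_monte_carlo(parallax_true, sigma)
print(f"Taylor: {d_taylor:.4f} kpc, Monte Carlo: {d_mc:.4f} kpc")
```

Both estimates exceed 1∕Π ₀, which is the sense of the bias: the long tail of p (D) at large D pulls the mean distance upward.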

For the case of a single source, an accepted strategy is to perform a Markov chain Monte Carlo (MCMC) analysis of the position-vs.-time data with D as a parameter and apply an appropriate prior distribution of D to estimate the final distribution, p (D). The difficulties of parallax analysis at low signal-to-noise ratio, including the Lutz–Kelker effect (Lutz and Kelker 1973), are discussed by Bailer-Jones (2015) and Verbiest and Lorimer (2014).

Parallaxes have been measured to many pulsars (Verbiest et al. 2010, 2012). These may be compared with indirect estimates based on dispersion measures and galactic models of electron density. Precision parallax distances may prove to be important in the use of pulsar timing measurements to detect gravitational radiation [see Madison et al. (2013)]. The distance to the pulsar PSR J2222-0137 has been determined to be $267.3_{-0.9}^{+1.2}$ pc, an accuracy of 0.4% (Deller et al. 2013).

The star IM Peg, which has detectable radio emission, provides an example of a precise measurement of parallax with VLBI (Bartel et al. 2015). Its position was precisely determined over 39 epochs spanning six years so that it could be used as a guide star for the physics experiment Gravity Probe B (Everitt et al. 2011). The position of the radio star is shown in Fig. 12.6. The position shift is dominated by the proper motion. The annual parallax can be readily seen when this proper motion, modeled as a constant velocity vector, is removed, as shown in Fig. 12.7.

An excellent example of the steady improvement in VLBI parallax measurements can be found in the work on the Orion Nebula, a galactic object of singular importance in astronomy. The results, shown in Table 12.1, made with a variety of continuum and spectral line sources over a considerable frequency range, have yielded a distance accurate to 1.5%. This corresponds to a parallactic accuracy [Eq. (12.46)] of ∼ 30 μas.

Table 12.1 VLBI parallax distance measurements to the Orion Nebula^a


There are other notable examples of parallax measurements. The distance to the Pleiades Cluster was determined by VLBI to be 136.2 ± 1.2 pc by Melis et al. (2014), resolving a long-standing discrepancy in its distance estimates. VLBI has also been used to detect the apparent motion of Sgr A*, the radio source at the center of the Galaxy, against the extragalactic background caused by the rotation of the Galaxy. These results are shown in Fig. 12.8. A combination of these data and parallax measurements with the VLBA, VERA, and EVN of more than 100 masers gives Galactic structure parameters of R ₀ = 8.34 ± 0.16 kpc and θ ₀ = 240 ± 8 km s⁻¹ (Reid et al. 2014).

12.6 Solar Gravitational Deflection

The bending of electromagnetic radiation passing a massive body is described in the parametrized post-Newtonian formalism of general relativity (GR) by the parameter γ and is normally written as [see, e.g., Misner et al. (1973), or Will (1993)]

$$\displaystyle{ \varDelta \epsilon = (1+\gamma )\frac{GM} {pc^{2}} \,(1+\cos \epsilon )\;. }$$

(12.50)

G is the gravitational constant. M is the mass of the perturbing body, which we take to be the Sun in this discussion; p is the impact parameter (closest approach of the unperturbed ray to the Sun); and ε is the elongation angle (the angle between the direction to the source and the direction to the Sun as seen by the observer). Equation (12.50) holds for sources at infinite distance. This parametrization reflects the fact that the bending predicted by Newtonian physics is exactly half the value predicted by GR, i.e., γ = 1 for GR and 0 for Newtonian physics. GM∕c ² is known as the gravitational radius, which is 1.48 km for the Sun. For a ray path passing close to the Sun where ε ≪ 1, Eq. (12.50) can be approximated as

$$\displaystyle{ \varDelta \epsilon = (1+\gamma )\frac{2GM} {pc^{2}} \;. }$$

(12.51)

For a ray grazing the surface of the Sun, where p = r ₀ (corresponding to ε = 0.267°) and r ₀ is the solar radius, the deflection angle is 1.75″.

Equation (12.50) can be rewritten so as to eliminate p, since p = R ₀sinε, where R ₀ is the distance from the Sun to the Earth. After some trigonometric manipulation, the deflection angle can be expressed as

$$\displaystyle{ \varDelta \epsilon = (1+\gamma ) \frac{GM} {R_{0}c^{2}}\sqrt{\frac{1+\cos \epsilon } {1-\cos \epsilon }}\;. }$$

(12.52)

Δ ε declines monotonically with ε, as shown in Fig. 12.9, and for γ = 1 has a value of 4.07 mas at ε = 90°, 1 mas at ε = 150°, and 0 at ε = 180°. Furthermore, two sources separated by 1° near ε = 90° will suffer a 70-μas shift in their relative positions.
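The numerical values quoted for Eq. (12.52) can be reproduced with a few lines of code. In this sketch the gravitational radius is taken as 1.477 km (1.48 km in the text) and R ₀ as 1 AU; the function and variable names are illustrative.

```python
import math

GM_OVER_C2_KM = 1.477        # solar gravitational radius (1.48 km in the text)
R0_KM = 1.4959787070e8       # Sun-Earth distance, 1 AU
RAD_TO_MAS = math.degrees(1.0) * 3600.0 * 1000.0


def deflection_mas(elongation_deg, gamma=1.0):
    """Deflection angle of Eq. (12.52) in milliarcseconds."""
    cos_eps = math.cos(math.radians(elongation_deg))
    scale = (1.0 + gamma) * GM_OVER_C2_KM / R0_KM
    return scale * math.sqrt((1.0 + cos_eps) / (1.0 - cos_eps)) * RAD_TO_MAS


limb_arcsec = deflection_mas(0.267) / 1000.0   # ray grazing the solar limb
at_90 = deflection_mas(90.0)
at_150 = deflection_mas(150.0)
# Relative shift of two sources 1 degree apart, centered on 90 degrees.
differential_uas = (deflection_mas(89.5) - deflection_mas(90.5)) * 1000.0

print(f"limb: {limb_arcsec:.2f} arcsec, 90 deg: {at_90:.2f} mas, "
      f"150 deg: {at_150:.2f} mas, differential: {differential_uas:.0f} uas")
```

Setting `gamma=0.0` halves every value, which is the Newtonian prediction described above.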

Shapiro (1967) first suggested that GR could be tested by observing the deflection of radio waves passing in the vicinity of the Sun. This is just the radio version of the famous optical experiment first performed in 1919 by the Eddington expedition (Dyson et al. 1920). For a long time, the radio astronomical experiments were based on the two sources 3C279 and 3C273, which are separated by about 10 degrees and pass fortuitously close in angle to the Sun each October. In fact, 3C279 is occulted by the Sun on October 8. The measurement of the change in relative position of these sources can be used to estimate γ. The challenge of such measurements is to overcome the effects of the ionized plasmas surrounding the Sun, i.e., the corona and solar wind, whose effects diminish with distance from the Sun and scale as λ ² (see Sect. 14.3.1). Note that the ray bending caused by the solar plasma has the opposite sign to that caused by GR, i.e., plasma bending makes sources appear closer to the Sun in angle and GR makes them appear farther.

The first radio interferometry experiments were undertaken for the 1969 passage, one with two antennas forming an interferometer at the Owens Valley Radio Observatory at a frequency of 9.1 GHz and a baseline of 1.4 km, and the other with two antennas forming an ad hoc interferometer at the JPL Goldstone facility at a frequency of 2.4 GHz and a baseline of 21 km. The solar plasma was modeled by both groups as two power-law components with amplitude parameters estimated from the data (see Sect. 14.3.1). The results from both experiments confirmed GR to an accuracy of about 30%, with the JPL instrument’s advantage of longer baseline and higher resolution compensating for the OVRO instrument’s advantage of higher frequency. The experiment has been repeated many times with ever more sensitive equipment and more refined techniques, and the results are listed in Table 12.2. The first VLBI experiment was reported by Counselman et al. (1974) for an 845-km baseline between Haystack and NRAO. Each site employed two antennas so that two coherent interferometers were formed to track both sources simultaneously.

Table 12.2 Measurements of solar gravitational bending with radio interferometry


Another major step forward was the use of dual-frequency observations. This allowed a phase (or delay) observable to be formed from the two phases ϕ ₁ and ϕ ₂ measured at frequencies ν ₁ and ν ₂,

$$\displaystyle{ \phi _{c} =\phi _{2} -\left (\frac{\nu _{1}} {\nu _{2}}\right )\phi _{1}\;, }$$

(12.53)

from which the dispersive effects of the solar plasma (and ionosphere) are largely removed (see Sect. 14.1.3). The best result so far for the targeted 3C279/3C273 experiment is γ = 0.9998 ± 0.0003 with the VLBA at 15, 23, and 43 GHz (Fomalont et al. 2009). It should be noted that this result was based largely on the 43-GHz results alone, where the plasma effects were greatly diminished. Various changes are expected to be made that will improve this experiment by a factor of four (i.e., to a fractional accuracy of better than 1 part in 10⁴).
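The cancellation in Eq. (12.53) can be illustrated with a toy phase model. The sketch assumes that the plasma contributes a phase term proportional to 1∕ν (the first-order dispersive behavior) on top of a nondispersive delay; the frequencies, delay, and plasma coefficient are hypothetical values chosen only for illustration.

```python
import math


def phase(nu_hz, delay_s, plasma_coeff):
    """Fringe phase (rad): nondispersive delay term plus a 1/nu plasma term."""
    return 2.0 * math.pi * nu_hz * delay_s + plasma_coeff / nu_hz


def plasma_free_phase(phi1, phi2, nu1_hz, nu2_hz):
    """Combined observable of Eq. (12.53)."""
    return phi2 - (nu1_hz / nu2_hz) * phi1


nu1, nu2 = 2.3e9, 8.4e9   # Hz, hypothetical frequency pair
delay = 1.0e-10           # s, hypothetical nondispersive delay
plasma = 5.0e9            # rad Hz, hypothetical plasma coefficient

with_plasma = plasma_free_phase(
    phase(nu1, delay, plasma), phase(nu2, delay, plasma), nu1, nu2)
without_plasma = plasma_free_phase(
    phase(nu1, delay, 0.0), phase(nu2, delay, 0.0), nu1, nu2)
print(with_plasma, without_plasma)
```

The two printed values agree to rounding error, although the plasma changes each individual phase by a radian or more. Higher-order dispersive terms, which this model omits, are why the removal is described as largely rather than completely effective.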

In addition, the huge geodetic VLBI database has been used to estimate γ, giving the results listed in Table 12.2. The analysis described by Lambert and Le Poncin-Lafitte (2011) is based on 5,055 observing sessions (1979–2010) of 3,706 sources and 7 million delay measurements. The postfit delay residual is 23 picoseconds, and γ = 0.9992 ± 0.0001. The continual accrual of geodetic data will lead to better results in the future. The best measurement overall of γ to date, γ = 1.000021 ± 0.000023, was made by analyzing the delay residuals from tracking the Cassini spacecraft as it passed the Sun in 2002 (Bertotti et al. 2003).

12.7 Imaging Astronomical Masers

In the envelopes of many newly formed stars, and also those of highly evolved stars and the accretion disks of AGN, radio emission from molecules such as H₂O and OH is caused by a maser process. The frequency spectrum of the emission is often complicated, containing many spectral features or components caused by clouds of gas moving at different line-of-sight velocities. Maps of strong maser sources reveal hundreds of compact components with brightness temperatures approaching 10¹⁵ K, angular sizes as small as 10⁻⁴ arcsec, and flux densities as high as 10⁶ Jy. The components are typically distributed over an area of several arcseconds in diameter and a Doppler velocity range of 10–3000 km s⁻¹ (0.7–200 MHz for the H₂O maser transition at 22 GHz). Individual features have line widths of about 1 km s⁻¹ or less (74 kHz at 22 GHz). The physics and phenomenology of masers are discussed by Reid and Moran (1988); Elitzur (1992); and Gray (2012). The processing and analysis of maser data require large correlator systems because the ratio of required bandwidth to spectral resolution is large (10²–10⁴). They also require prodigious amounts of image processing because the ratio of the field of view to the spatial resolution is large (10²–10⁴). As an extreme example, the H₂O maser in W49 has hundreds of features distributed over 3 arcsec (Gwinn et al. 1992). The complete mapping of this source at a resolution of 10⁻³ arcsec with 3 pixels per resolution interval would require the production of 600 maps, each with at least 10⁸ pixels. However, most of the map cells would contain no emission. Thus, the usual procedure is to measure the positions of the features crudely by fringe-frequency analysis and then to map small fields around these locations by Fourier synthesis techniques. Examples of maps made by fringe-frequency analysis can be found in Walker et al. (1982); by phase analysis in Genzel et al. (1981) and Norris and Booth (1981); and by Fourier synthesis in Reid et al. (1980), Norris et al. (1982), and Boboltz et al. (1997). We shall briefly discuss some of the techniques used in mapping masers and their accuracies. Note that geometric (group) delays cannot be measured accurately because of the narrow bandwidths of the maser lines.

In mapping masers, we must explicitly consider the frequency dependence of the fringe visibility. We assume that a maser source consists of a number of point sources. Furthermore, we assume that the measurements are made with a VLBI system and that the desired RF band is converted to a single baseband channel. Adapting Eq. (9.28), we can write the residual fringe phase of one maser component at frequency ν as

$$\displaystyle{ \varDelta \phi (\nu ) = 2\pi \,\left [\nu \varDelta \tau _{g}(\nu ) + (\nu -\nu _{\mathrm{LO}})\tau _{e} +\nu \tau _{\mathrm{at}}\right ] +\phi _{\mathrm{in}} + 2\pi n\;, }$$

(12.54)

where τ _e is the relative delay error due to clock offsets; τ _at is the differential atmospheric delay; Δ τ _g(ν) is the difference between the true geometric delay of the source τ _g(ν) and the expected (reference) delay; ν _LO is the local oscillator frequency; ϕ _in is the instrumental phase, which includes the local oscillator frequency difference and can be a rapidly varying function of time; and 2π n represents the phase ambiguity. A frequency can usually be found that has only one unresolved maser component, and this component can then be used as a phase reference. The use of a phase reference feature is fundamental to all maser analysis procedures, and it allows maps of the relative positions of maser components to be made with high accuracy. The difference in residual fringe phase between a maser feature at frequency ν and the reference feature at frequency ν _R is

$$\displaystyle{ \varDelta ^{2}\phi (\nu ) =\varDelta \phi (\nu ) -\varDelta \phi (\nu _{ R})\;, }$$

(12.55)

which, with the use of Eq. (12.54), becomes

$$\displaystyle\begin{array}{rcl} \varDelta ^{2}\phi (\nu )& =& 2\pi {\Bigl \{\nu \Bigr.}\left [\tau _{ g}(\nu ) -\tau _{g}(\nu _{R})\right ] \\ & & +\left.(\nu -\nu _{R})\left [\tau _{g}(\nu _{R}) -\tau '_{g}(\nu _{R})\right ] + (\nu -\nu _{R}){\Bigl [\tau _{e} +\tau _{\mathrm{at}}\Bigr ]}\right \}\;,{}\end{array}$$

(12.56)

where τ′_g(ν _R) is the expected delay of the reference feature, and τ _g(ν _R) is the true delay. The frequency-independent terms ϕ _in and 2π n cancel in Eq. (12.56). However, there are residual terms in Eq. (12.56) that are proportional to the difference in frequency between the feature of interest and the reference feature. These terms arise because phases at different frequencies are differenced in Eq. (12.55). Following the notation of Eq. (12.7), which uses the convention Δ term = (assumed value) – (true value), we can write Eq. (12.56) as

$$\displaystyle\begin{array}{rcl} \varDelta ^{2}\phi (\nu )& =& \frac{2\pi \nu } {c}\mathbf{D}\,\boldsymbol{\cdot }\,\varDelta \mathbf{s}_{\nu R} -\frac{2\pi \nu } {c}\varDelta \mathbf{D}\,\boldsymbol{\cdot }\,\varDelta \mathbf{s}_{\nu R} \\ & & -\frac{2\pi } {c}\left [(\nu -\nu _{R})(\varDelta \mathbf{D}\,\boldsymbol{\cdot }\,\mathbf{s}_{R} + \mathbf{D}\,\boldsymbol{\cdot }\,\varDelta \mathbf{s}_{R})\right ] + 2\pi \,(\nu -\nu _{R})(\tau _{e} +\tau _{\mathrm{at}})\;,{}\end{array}$$

(12.57)

where D is the assumed baseline, Δ D is the baseline error, s _R is the assumed position of the reference feature, and Δ s _R is the corresponding position error. Δ s _{ν R} is the separation vector from the reference feature to the feature at frequency ν, and thus the true position of the feature at frequency ν is s _R −Δ s _R +Δ s _{ν R}.

The first term on the right side of Eq. (12.57) is the desired quantity from which the position of the feature relative to the reference feature can be determined, and the remaining terms describe the phase errors introduced by uncertainty in baseline, source position, clock offset, and atmospheric delay. These phase error terms can be converted approximately to angular errors by dividing them by 2π ν D∕c. Thus, for example, an error of 0.3 m in a baseline component would cause a delay error of about 1 ns in the term $\varDelta \mathbf{D}\,\boldsymbol{\cdot }\,\mathbf{s}_{R}$ in Eq. (12.57) and a phase error of 10⁻³ turns for features separated by 1 MHz. This phase error corresponds to a nominal error of 10⁻⁶ arcsec on a baseline of 2500 km at 22 GHz, which provides a fringe spacing of 10⁻³ arcsec. Similarly, a clock or atmospheric error of 1 ns would cause the same positional error. The same baseline error also causes additional positional errors, through the $\varDelta \mathbf{D}\,\boldsymbol{\cdot }\,\varDelta \mathbf{s}_{\nu R}$ term, of 10⁻⁷ arcsec per arcsecond separation of the features. A detailed discussion of mapping errors caused by this calibration method can be found in Genzel et al. (1981).
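The error budget in the preceding paragraph is simple arithmetic, and it can be useful to see each step. The sketch below uses only the numbers quoted in the text; the variable names are illustrative, and a phase error in turns is converted to an angle by multiplying by the fringe spacing λ∕D.

```python
import math

C_M_PER_S = 299_792_458.0
ARCSEC_PER_RAD = math.degrees(1.0) * 3600.0

baseline_error_m = 0.3          # error in one baseline component
baseline_m = 2.5e6              # 2500-km baseline
freq_hz = 22e9                  # H2O maser transition
feature_separation_hz = 1e6     # frequency offset from the reference feature

# Delay error from the Delta-D . s_R term, then phase error in turns.
delay_error_s = baseline_error_m / C_M_PER_S
phase_error_turns = delay_error_s * feature_separation_hz

# Fringe spacing lambda / D, and the resulting position error.
fringe_spacing_arcsec = (C_M_PER_S / freq_hz) / baseline_m * ARCSEC_PER_RAD
position_error_arcsec = phase_error_turns * fringe_spacing_arcsec

# Delta-D . Delta-s term: fractional baseline error per unit separation.
second_order_per_arcsec = baseline_error_m / baseline_m

print(f"delay error: {delay_error_s:.2e} s")
print(f"phase error: {phase_error_turns:.2e} turns")
print(f"fringe spacing: {fringe_spacing_arcsec:.2e} arcsec")
print(f"position error: {position_error_arcsec:.2e} arcsec")
print(f"second-order error: {second_order_per_arcsec:.1e} arcsec per arcsec")
```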

Another method of calibrating the fringe phase is to scale the phase of the reference feature to the frequency of the feature to be calibrated. That is,

$$\displaystyle{ \varDelta ^{2}\phi (\nu ) =\varDelta \phi (\nu ) -\varDelta \phi (\nu _{ R}) \frac{\nu } {\nu _{R}}\;. }$$

(12.58)

This method of calibration is more accurate than the method of Eq. (12.55) because error terms proportional to ν −ν _R do not appear. However, there are additional terms involving the phase ambiguity and the instrumental phase. Thus, this calibration method is applicable only if the fringe phase can be followed carefully enough to avoid the introduction of phase ambiguities.

Maps of lower accuracy and sensitivity than those obtainable from phase data can be made with fringe-frequency data. Suppose that the interferometer is well calibrated. The differential fringe frequency, that is, the difference in fringe frequency between the feature at frequency ν and the reference feature, can then be written [using Eq. (12.20)]

$$\displaystyle{ \varDelta ^{2}\nu _{ f}(\nu ) \simeq \dot{u}\varDelta \alpha '(\nu ) + \dot{v}\varDelta \delta (\nu )\;, }$$

(12.59)

where $\dot{u}$ and $\dot{v}$ are the time derivatives of the projected baseline components, Δ α′(ν) and Δ δ(ν) are the coordinate offsets from the reference feature, and Δ α′(ν) = Δ α(ν)cosδ. The relative positions of the maser feature can then be found by fitting Eq. (12.59) to a series of fringe-frequency measurements at various hour angles. This technique was first employed by Moran et al. (1968) for the mapping of an OH maser. The errors in fringe-frequency measurements decrease as $\tau ^{-3/2}$ [see Eq. (A12.27)], where τ is the length of an observation, but for large values of τ, the differential fringe frequency Δ ² ν _f is not constant, because $\ddot{u}$ and $\ddot{v}$ are not zero. Thus, there is a limited field of view available for accurate mapping with fringe-frequency measurements. This field of view can be estimated by equating the rms fringe-frequency error in Eq. (A12.27) with τ times the derivative of the differential fringe frequency with respect to time. Therefore, for an east–west baseline,

$$\displaystyle{ D_{\lambda }\omega _{e}^{2}\varDelta \theta \tau \,\cos \theta \simeq \sqrt{ \frac{3} {2\pi ^{2}}}\left (\frac{T_{S}} {T_{\!A}}\right ) \frac{1} {\sqrt{\varDelta \nu \tau ^{3}}}\;, }$$

(12.60)

where Δ θ is the field of view. For $\sqrt{2\pi ^{2 } /3}\cos \theta \simeq 1$, the field of view is

$$\displaystyle{ \varDelta \theta \simeq \frac{T_{S}} {D_{\lambda }T_{\!A}\omega _{e}^{2}\tau ^{2}\sqrt{\varDelta \nu \tau }}\;, }$$

(12.61)

or

$$\displaystyle{ \varDelta \theta \simeq \frac{1} {\mathcal{R}_{\,\mathrm{sn}}D_{\lambda }\omega _{e}^{2}\tau ^{2}}\;, }$$

(12.62)

where $\mathcal{R}_{\,\mathrm{sn}}$ is the signal-to-noise ratio. Let $\mathcal{R}_{\,\mathrm{sn}} = 10$ and τ = 100 s. The field of view is then about equal to 2000 times the fringe spacing. This restriction is often important. Usually when a feature is found, the phase center of the field is moved to the estimated position of the feature, and the position is then redetermined. Only components that are detected in individual observations on each baseline can be mapped with the fringe-frequency mapping technique. Thus, fringe-frequency mapping is less sensitive than synthesis mapping, in which fully coherent sensitivity is achieved.
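The figure of about 2000 fringe spacings follows from Eq. (12.62) once Δ θ is expressed in units of the fringe spacing 1∕D _λ. A minimal check, assuming the sidereal rotation rate ω _e = 7.2921 × 10⁻⁵ rad s⁻¹ (the function name is illustrative):

```python
OMEGA_E = 7.2921e-5  # Earth rotation rate, rad/s (assumed value)


def field_of_view_in_fringes(snr, tau_s):
    """Eq. (12.62) multiplied by D_lambda: field of view in fringe spacings."""
    return 1.0 / (snr * OMEGA_E**2 * tau_s**2)


fov = field_of_view_in_fringes(snr=10.0, tau_s=100.0)
print(f"field of view: {fov:.0f} fringe spacings")
```

Since the field of view falls as τ ⁻², doubling the observation length to 200 s at the same signal-to-noise ratio would reduce it to a quarter of this value.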

The fringe-frequency analysis procedure can be extended to handle the case in which there are many point components in one frequency channel. From each observation (i.e., a measurement on one baseline lasting for a few minutes), the fringe-frequency spectrum is calculated. Multiple components will appear as distinct fringe-frequency features, as shown in Fig. 12.10. The fringe frequency of each feature defines a line in (Δ α′, Δ δ) space on which a maser component lies. The slope of the line is $\tan ^{-1}(\dot{v}/\dot{u})$. As the projected baseline changes, the slopes of the lines change. The intersections of the lines define the source positions (see Fig. 12.10). For this method to work, the components must be sufficiently separate to produce separate peaks in the fringe-frequency spectrum. The fringe-frequency resolution is about τ ⁻¹, which defines an effective beam of width

$$\displaystyle{ \varDelta \theta _{f} = \frac{1} {D_{\lambda }\omega _{e}\tau \cos \theta }\;. }$$

(12.63)

Fringe-frequency mapping is discussed in detail, for example, by Walker (1981). It remains a useful technique for arrays that involve instruments such as RadioAstron.
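The fit of Eq. (12.59) to fringe-frequency measurements at several hour angles, described earlier in this section, can be sketched as a two-parameter linear least-squares problem. The expressions for $\dot{u}$ and $\dot{v}$ below are the standard ones for equatorial baseline components X and Y measured in wavelengths, which is an assumption of this sketch, as are the baseline, declination, and offsets used to generate the noiseless test data.

```python
import math

OMEGA_E = 7.2921e-5  # Earth rotation rate, rad/s (assumed value)


def uv_rates(hour_angle_rad, dec_rad, x_wl, y_wl):
    """Time derivatives of u and v (wavelengths/s) for equatorial baseline
    components X, Y in wavelengths (standard geometry, assumed here)."""
    cos_h, sin_h = math.cos(hour_angle_rad), math.sin(hour_angle_rad)
    u_dot = OMEGA_E * (x_wl * cos_h - y_wl * sin_h)
    v_dot = OMEGA_E * math.sin(dec_rad) * (x_wl * sin_h + y_wl * cos_h)
    return u_dot, v_dot


def fit_offsets(rates, fringe_freqs):
    """Least-squares solution of Eq. (12.59) for the offsets, in radians."""
    suu = sum(u * u for u, _ in rates)
    suv = sum(u * v for u, v in rates)
    svv = sum(v * v for _, v in rates)
    suf = sum(u * f for (u, _), f in zip(rates, fringe_freqs))
    svf = sum(v * f for (_, v), f in zip(rates, fringe_freqs))
    det = suu * svv - suv * suv
    return (svv * suf - suv * svf) / det, (suu * svf - suv * suf) / det


true_offsets = (1.0e-6, -2.0e-6)  # hypothetical offsets, rad
hour_angles = [math.radians(15.0 * h) for h in range(-3, 4)]
rates = [uv_rates(h, math.radians(30.0), 1.0e8, 5.0e7) for h in hour_angles]
freqs = [u * true_offsets[0] + v * true_offsets[1] for u, v in rates]
fitted = fit_offsets(rates, freqs)
print(fitted)
```

With real data, each fringe-frequency measurement would carry the error of Eq. (A12.27), and the normal equations would be weighted accordingly.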

Notes

1.
For simplicity, we use the term geodetic to include geodynamic and static phenomena regarding the shape and orientation of the Earth.

Author information

Authors and Affiliations

National Radio Astronomy Observatory, Charlottesville, Virginia, USA
A. Richard Thompson
Harvard Smithsonian Center for Astrophysics, Cambridge, Massachusetts, USA
James M. Moran
University of Illinois Urbana Champaign, Champaign, Illinois, USA
George W. Swenson Jr.


Appendices

Appendix 12.1 Least-Mean-Squares Analysis

The principles of least-mean-squares analysis play a fundamental role in astrometry, where the goal is to extract a number of parameters from a set of noisy measurements. We briefly discuss these principles in an elementary way, ignoring mathematical subtleties, and apply them to the problems encountered in interferometry. Detailed discussions of the statistical analysis of data can be found in books such as Bevington and Robinson (1992) and Hamilton (1964). The exhaustive treatment of how to fit a straight line, by Hogg et al. (2010), is highly recommended.

A12.1.1 Linear Case

Suppose we wish to measure a quantity m. We make a set of measurements y _i that are the sum of the desired quantity m and a noise contribution n _i:

$$\displaystyle{ y_{i} = m + n_{i}\;, }$$

(A12.1)

where n _i is a Gaussian random variable with zero mean and variance σ _i ². The probability that the ith measurement will take any specific value of y _i is given by the probability (density) function

$$\displaystyle{ p\,(y_{i}) = \frac{1} {\sqrt{2\pi }\sigma _{i}}\,e^{-(y_{i}-m)^{2}/2\sigma _{ i}^{2} }\;. }$$

(A12.2)

If all the measurements are independent, then the probability that an experiment will yield a set of N measurements y ₁, y ₂, …, y _N is

$$\displaystyle{ L =\prod \limits _{ i=1}^{N}p\,(y_{ i})\;, }$$

(A12.3)

where the ∏ denotes the product of the p (y _i) terms. L, viewed as a function of m, is called the likelihood function. The method of maximum likelihood is based on the assumption that the best estimate of m is the one that maximizes L. Maximizing L is the same as maximizing lnL, where

$$\displaystyle{ \ln L =\sum _{ i=1}^{N}\ln \frac{1} {\sqrt{2\pi }\sigma _{i}} -\frac{1} {2}\sum _{i=1}^{N}\frac{(y_{i} - m)^{2}} {\sigma _{i}^{2}} \;. }$$

(A12.4)

Since the first summation term on the right side of Eq. (A12.4) is a constant and the second summation term is multiplied by $-\frac{1} {2}$, the maximization of L is equivalent to the minimization of the second summation term in Eq. (A12.4) with respect to m. Thus, we wish to minimize the quantity χ ² given by

$$\displaystyle{ \chi ^{2} =\sum _{ i=1}^{N}\frac{(y_{i} - m)^{2}} {\sigma _{i}^{2}} \;. }$$

(A12.5)

In the more general problem discussed later in this appendix, m is replaced by a function with one or more parameters describing the system model. With this generalization, Eq. (A12.5) becomes the fundamental equation of the method of weighted least-mean-squares. In this method, the parameters of the model are determined by minimizing the sum of the squared differences between the measurements and the model, weighted by the inverse variances of the measurements. The quantity χ ², which indicates the goodness of fit, is a random variable whose mean value equals the number of data points less the number of parameters when the model adequately describes the measurements. The method of least-mean-squares, appropriate when the noise is a Gaussian random process, is a special case of the more general method of maximum likelihood. Gauss invented the method of least-mean-squares, perhaps as early as 1795, using arguments similar to those given here, for the purpose of estimating the orbital parameters of planets and comets (Gauss 1809). The method was independently developed by Legendre in 1806 (Hall 1970).

Returning to Eq. (A12.5), we can estimate m by setting the derivative of χ ² with respect to m equal to zero. The resulting estimate of m, denoted by m _e, is

$$\displaystyle{ m_{e} = \frac{\sum \frac{y_{i}} {\sigma _{i}^{2}} } {\sum \frac{1} {\sigma _{i}^{2}}} \;, }$$

(A12.6)

where the summation goes from i = 1 to N. Using Eq. (A12.2), we note that 〈y _i〉 = m and 〈y _i ²〉 = m ² +σ _i ². Therefore, by calculating the expectation of Eq. (A12.6), it is clear that 〈m _e〉 = 〈y _i〉 = m, and it is easy to show that

$$\displaystyle{ \langle m_{e}^{2}\rangle = m^{2} + \left (\sum \frac{1} {\sigma _{i}^{2}}\right )^{-1}\;. }$$

(A12.7)

Hence the variance of the estimate of m _e is

$$\displaystyle{ \sigma _{m}^{2} =\langle m_{ e}^{2}\rangle -\langle m_{ e}\rangle ^{2} = \left (\sum \frac{1} {\sigma _{i}^{2}}\right )^{-1}\;. }$$

(A12.8)

Equation (A12.8) shows that when poor quality or noisy data are added to better data, the value of σ _m may be reduced only slightly. If the statistical error σ _i of each of the measurements has the same value, σ, then Eq. (A12.8) reduces to the well-known result

$$\displaystyle{ \sigma _{m} = \frac{\sigma } {\sqrt{N}}\;, }$$

(A12.9)

and m _e is the average of the measurements. In many instances, σ is not known. An estimate of σ is

$$\displaystyle{ \sigma _{e}^{2} = \frac{1} {N}\sum (y_{i} - m)^{2}\;. }$$

(A12.10)

However, m is not known, only its estimate, m _e. If m _e were used in place of m in Eq. (A12.10), the value of σ _e ² would be an underestimate of σ ² because of the manner in which m _e was determined in minimizing χ ². The unbiased estimate of σ ² is

$$\displaystyle{ \sigma _{e}^{2} = \frac{1} {N - 1}\sum (y_{i} - m_{e})^{2}\;. }$$

(A12.11)

It is easy to show by substitution of Eq. (A12.6) into Eq. (A12.11) that 〈σ _e ²〉 = σ ². The term N − 1, which is called the number of degrees of freedom, appears in Eq. (A12.11) because there are N data points and one free parameter.
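Equations (A12.6), (A12.8), and (A12.11) translate directly into code. The sketch below uses made-up measurement values to show how little a noisy third measurement improves σ _m when it is combined with two good ones; the function names are illustrative.

```python
import math


def weighted_mean(values, sigmas):
    """Return (m_e, sigma_m) from Eqs. (A12.6) and (A12.8)."""
    weights = [1.0 / s**2 for s in sigmas]
    weight_sum = sum(weights)
    m_e = sum(w * y for w, y in zip(weights, values)) / weight_sum
    return m_e, 1.0 / math.sqrt(weight_sum)


def unbiased_variance(values):
    """Estimate of sigma^2 from Eq. (A12.11), for equal-weight data."""
    n = len(values)
    mean = sum(values) / n
    return sum((y - mean) ** 2 for y in values) / (n - 1)


# Two good measurements, then the same two plus one noisy measurement.
good_mean, good_sigma = weighted_mean([10.2, 9.8], [0.1, 0.1])
all_mean, all_sigma = weighted_mean([10.2, 9.8, 10.5], [0.1, 0.1, 1.0])
sample_variance = unbiased_variance([1.0, 2.0, 3.0, 4.0])

print(f"two good points: {good_mean:.4f} +/- {good_sigma:.4f}")
print(f"with noisy point: {all_mean:.4f} +/- {all_sigma:.4f}")
```

The noisy point carries a weight 100 times smaller than either good point, so it shifts the estimate and reduces σ _m only marginally.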

Consider a model described by the function f(x ; p ₁, …, p _n), where x is the independent variable, which takes values x _i, where i = 1 to N, at the sample points, and p ₁, …, p _n are a set of parameters. We assume that the values of the independent variable are exactly known. If the function f correctly models the measurement system, the measurement set is given by

$$\displaystyle{ y_{i} = f(x_{i};p_{1},\ldots,p_{n}) + n_{i}\;, }$$

(A12.12)

where n _i represents the measurement error. The general problem is to find the values of the parameters for which χ ², given by the generalization of Eq. (A12.5),

$$\displaystyle{ \chi ^{2} =\sum \frac{\left [y_{i} - f(x_{i})\right ]^{2}} {\sigma _{i}^{2}} \;, }$$

(A12.13)

is a minimum.

A simple example of this problem is the fitting of a straight line to a data set. Let

$$\displaystyle{ f(x\,;a,b) = a + bx\;, }$$

(A12.14)

where a and b are the parameters to be found. Minimizing χ ² is accomplished by solving the equations

$$\displaystyle\begin{array}{rcl} \frac{\partial \chi ^{2}} {\partial a}& =& -\sum \frac{2(y_{i} - a - bx_{i})} {\sigma _{i}^{2}} = 0\;,{}\end{array}$$

(A12.15a)

and

$$\displaystyle\begin{array}{rcl} \frac{\partial \chi ^{2}} {\partial b}& =& -\sum \frac{2(y_{i} - a - bx_{i})x_{i}} {\sigma _{i}^{2}} = 0\,.{}\end{array}$$

(A12.15b)

In matrix notation, we have

$$\displaystyle{ \left [\begin{array}{c} \sum \frac{y_{i}} {\sigma _{i}^{2}}\\[1.5ex] \sum \frac{x_{i}y_{i}} {\sigma _{i}^{2}} \end{array} \right ] = \left [\begin{array}{cc} \sum \frac{1} {\sigma _{i}^{2}} & \sum \frac{x_{i}} {\sigma _{i}^{2}}\\[1.5ex] \sum \frac{x_{i}} {\sigma _{i}^{2}} & \sum \frac{x_{i}^{2}} {\sigma _{i}^{2}} \end{array} \right ]\left [\begin{array}{c} a_{e}\\[1.5ex] b_{e} \end{array} \right ]\;, }$$

(A12.16)

where we distinguish between the true values of the parameters and their estimates by the subscript e. The solution is

$$\displaystyle{ a_{e} = \frac{1} {\varDelta } \left [\left (\sum \frac{x_{i}^{2}} {\sigma _{i}^{2}} \right )\left (\sum \frac{y_{i}} {\sigma _{i}^{2}} \right ) -\left (\sum \frac{x_{i}} {\sigma _{i}^{2}} \right )\left (\sum \frac{x_{i}y_{i}} {\sigma _{i}^{2}} \right )\right ] }$$

(A12.17)

and

$$\displaystyle{ b_{e} = \frac{1} {\varDelta } \left [\left (\sum \frac{1} {\sigma _{i}^{2}}\right )\left (\sum \frac{x_{i}y_{i}} {\sigma _{i}^{2}} \right ) -\left (\sum \frac{x_{i}} {\sigma _{i}^{2}} \right )\left (\sum \frac{y_{i}} {\sigma _{i}^{2}} \right )\right ]\;, }$$

(A12.18)

where Δ is the determinant of the square matrix in Eq. (A12.16), given by

$$\displaystyle{ \varDelta = \left (\sum \frac{1} {\sigma _{i}^{2}}\right )\left (\sum \frac{x_{i}^{2}} {\sigma _{i}^{2}} \right ) -\left (\sum \frac{x_{i}} {\sigma _{i}^{2}} \right )^{2}\;. }$$

(A12.19)

Estimates of the errors in the parameters a _e and b _e can be calculated from Eqs. (A12.17) and (A12.18) and are given by

$$\displaystyle{ \sigma _{a}^{2} =\langle a_{ e}^{2}\rangle -\langle a_{ e}\rangle ^{2} = \frac{1} {\varDelta } \sum \frac{x_{i}^{2}} {\sigma _{i}^{2}} }$$

(A12.20)

and

$$\displaystyle{ \sigma _{b}^{2} =\langle b_{ e}^{2}\rangle -\langle b_{ e}\rangle ^{2} = \frac{1} {\varDelta } \sum \frac{1} {\sigma _{i}^{2}}\;. }$$

(A12.21)

Note that a _e and b _e are random variables, and in general 〈a _e b _e〉 is not zero, so the parameter estimates are correlated. The error estimates in Eqs. (A12.20) and (A12.21) include the deleterious effects of the correlation between parameters. In this particular example, the correlation can be made equal to zero by adjusting the origin of the x axis so that ∑(x _i∕σ _i ²) = 0.
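The straight-line solution of Eqs. (A12.17)–(A12.21) can be written compactly. This is a minimal sketch with illustrative names, exercised on noiseless points from the line y = 2 + 3x so that the recovered parameters are known in advance.

```python
import math


def fit_line(x, y, sigma):
    """Weighted fit of y = a + b x; returns (a_e, b_e, sigma_a, sigma_b)."""
    w = [1.0 / s**2 for s in sigma]
    s0 = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = s0 * sxx - sx**2                 # Eq. (A12.19)
    a_e = (sxx * sy - sx * sxy) / delta      # Eq. (A12.17)
    b_e = (s0 * sxy - sx * sy) / delta       # Eq. (A12.18)
    sigma_a = math.sqrt(sxx / delta)         # Eq. (A12.20)
    sigma_b = math.sqrt(s0 / delta)          # Eq. (A12.21)
    return a_e, b_e, sigma_a, sigma_b


x = [-1.0, 0.0, 1.0]
y = [2.0 + 3.0 * xi for xi in x]
a_e, b_e, sigma_a, sigma_b = fit_line(x, y, [1.0, 1.0, 1.0])
print(a_e, b_e, sigma_a, sigma_b)
```

The sample points here are centered so that ∑(x _i∕σ _i ²) = 0, which is the choice of origin that makes the estimates of a and b uncorrelated.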

The above analysis can be used to estimate the accuracy of measurements of fringe frequency and delay made with an interferometer. Fringe frequency, the rate of change of fringe phase with time,

$$\displaystyle{ \nu _{f} = \frac{1} {2\pi } \frac{\partial \phi } {\partial t}\;, }$$

(A12.22)

can be estimated by fitting a straight line to a sequence of uniformly spaced measurements of phase with respect to time. The fringe frequency is proportional to the slope of this line. Assume that N measurements of phase ϕ _i, each having the same rms error σ _ϕ, are made at times t _i, spaced by interval T, running from time − NT∕2 to NT∕2, such that the total time of the observation is τ = NT. From Eq. (A12.21) and the above definitions, including Eq. (A12.22), the error in the fringe-frequency estimate is

$$\displaystyle{ \sigma _{f}^{2} = \frac{\sigma _{\phi }^{2}} {(2\pi )^{2}\sum t_{i}^{2}}\;, }$$

(A12.23)

since ∑ t _i = 0. The term ∑ t _i ² is approximately given by

$$\displaystyle{ \sum t_{i}^{2} \simeq \frac{1} {T}\int _{-\tau /2}^{\tau /2}t^{2}dt = \frac{1} {T} \frac{\tau ^{3}} {12} = \frac{N\tau ^{2}} {12} \;. }$$

(A12.24)

$\tau /\sqrt{12}$ can be thought of as the rms time span of the data. Thus, Eq. (A12.23) becomes

$$\displaystyle{ \sigma _{f}^{2} = \frac{12\sigma _{\!\phi }^{2}} {(2\pi )^{2}N\tau ^{2}}\;. }$$

(A12.25)

The expression for σ _ϕ, given in Eq. (6.64) for the case when the source is unresolved and there are no processing losses, is

$$\displaystyle{ \sigma _{\phi } = \frac{T_{S}} {T_{\!A}\sqrt{2\varDelta \nu T}}\;, }$$

(A12.26)

where T _S is the system temperature, T _A is the antenna temperature due to the source, and Δ ν is the bandwidth. Substitution of Eq. (A12.26) into Eq. (A12.25) yields

$$\displaystyle{ \sigma _{f} = \sqrt{ \frac{3} {2\pi ^{2}}}\left (\frac{T_{S}} {T_{\!A}}\right ) \frac{1} {\sqrt{\varDelta \nu \tau ^{3}}}\ \ \mathrm{(Hz)}\;. }$$

(A12.27)

Note that this result does not depend on the details of the analysis procedure, such as the choice of N. Equivalently, one can estimate the fringe frequency by finding the peak of the fringe-frequency spectrum, that is, the peak of the Fourier transform of $e^{j\phi _{i}}$.
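The statement that Eq. (A12.27) does not depend on N can be checked by evaluating Eq. (A12.23) with the exact discrete sum ∑ t _i ² and comparing it with the closed form. The system parameters below (T _S∕T _A = 100, Δ ν = 2 MHz, τ = 100 s) are hypothetical, and the sample times are placed symmetrically about zero.

```python
import math


def sigma_f_from_samples(n, tau_s, ts_over_ta, bandwidth_hz):
    """Fringe-frequency error from Eqs. (A12.23) and (A12.26), exact sum."""
    t_step = tau_s / n
    times = [(i - (n - 1) / 2.0) * t_step for i in range(n)]
    sigma_phi = ts_over_ta / math.sqrt(2.0 * bandwidth_hz * t_step)
    return sigma_phi / (2.0 * math.pi * math.sqrt(sum(t * t for t in times)))


def sigma_f_closed_form(tau_s, ts_over_ta, bandwidth_hz):
    """Fringe-frequency error from Eq. (A12.27), in Hz."""
    return (math.sqrt(3.0 / (2.0 * math.pi**2)) * ts_over_ta
            / math.sqrt(bandwidth_hz * tau_s**3))


closed = sigma_f_closed_form(100.0, 100.0, 2.0e6)
ratios = {n: sigma_f_from_samples(n, 100.0, 100.0, 2.0e6) / closed
          for n in (10, 100, 1000)}
for n, ratio in ratios.items():
    print(f"N = {n}: exact / closed form = {ratio:.6f}")
```

The ratio differs from unity only through the integral approximation of Eq. (A12.24), and the difference shrinks rapidly as N grows.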

The delay is the rate of change of phase with frequency,

$$\displaystyle{ \tau = \frac{1} {2\pi } \frac{\partial \phi } {\partial \nu }\;. }$$

(A12.28)

Thus, the delay can be estimated by finding the slope of a straight line fitted to a sequence of phase measurements as a function of frequency. For a single band, such data can be obtained from the cross power spectrum, the Fourier transform of the cross-correlation function. Assume that N measurements of phase are made at frequencies ν _i, each with a bandwidth Δ ν∕N and with an error σ _ϕ. In this calculation, only the relative frequencies are important. It is convenient for the purpose of analysis to set the zero of the frequency axis such that ∑ ν _i = 0. The error in delay [from Eqs. (A12.19), (A12.21), and (A12.28)] is

$$\displaystyle{ \sigma _{\tau }^{2} = \frac{\sigma _{\!\phi }^{2}} {(2\pi )^{2}\sum \nu _{i}^{2}}\;. }$$

(A12.29)

Using a calculation for ∑ ν _i ² analogous to the one in Eq. (A12.24), we can write Eq. (A12.29) as

$$\displaystyle{ \sigma _{\tau }^{2} = \frac{12\sigma _{\!\phi }^{2}} {(2\pi )^{2}N\varDelta \nu ^{2}}\;. }$$

(A12.30)

Thus, substitution of Eq. (A12.26) (with an integration time of τ and bandwidth Δ ν∕N) into Eq. (A12.30) yields

$$\displaystyle{ \sigma _{\tau } = \sqrt{ \frac{3} {2\pi ^{2}}}\left (\frac{T_{S}} {T_{\!A}}\right ) \frac{1} {\sqrt{\varDelta \nu ^{3}\tau }}\;. }$$

(A12.31)

We can define the rms bandwidth as

$$\displaystyle{ \varDelta \nu _{\mathrm{rms}} = \sqrt{ \frac{1} {N}\sum \nu _{i}^{2}} }$$

(A12.32)

and obtain from Eqs. (A12.26) and (A12.29) the result quoted in Sect. 9.8 [Eq. (9.179)],

$$\displaystyle{ \sigma _{\tau } = \frac{1} {\zeta } \left (\frac{T_{S}} {T_{\!A}}\right ) \frac{1} {\sqrt{\varDelta \nu _{\mathrm{rms } }^{3}\tau }}\;, }$$

(A12.33)

where ζ = π(768)^{1∕4}. (Note that in Sect. 9.8, σ _ϕ applies to the full bandwidth Δ ν.) The expressions for σ _τ in Eqs. (A12.30), (A12.31), and (A12.33) incorporate the condition $\varDelta \nu _{\mathrm{rms}} =\varDelta \nu /\sqrt{12}$ and apply to a continuous passband of width Δ ν.

In bandwidth synthesis, which is described in Sect. 9.8, the measurement system consists of N channels of width Δ ν∕N, which are not in general contiguous. The rms delay error is obtained by substituting Eqs. (A12.26) and (A12.32) into Eq. (A12.29), yielding

$$\displaystyle{ \sigma _{\tau } = \frac{1} {\sqrt{8\pi ^{2}}}\left (\frac{T_{S}} {T_{\!A}}\right ) \frac{1} {\sqrt{\varDelta \nu \tau }\,\varDelta \nu _{\mathrm{rms}}}\;, }$$

(A12.34)

where Δ ν _rms is given by Eq. (A12.32) and Δ ν is the total bandwidth. Δ ν _rms is generally equal to about 40% of the total frequency range spanned.
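As a rough illustration of Eqs. (A12.31), (A12.32), and (A12.34), the sketch below computes the rms bandwidth of a set of channel frequencies and the resulting delay error. The channel layout (eight 2-MHz channels spread over 360 MHz), the system parameters, and the function names are hypothetical values chosen for the example; they do not describe any particular observing system.

```python
import math

def rms_bandwidth(freqs_hz):
    """Eq. (A12.32), with the frequency origin shifted so that sum(nu_i) = 0."""
    mean = sum(freqs_hz) / len(freqs_hz)
    return math.sqrt(sum((f - mean) ** 2 for f in freqs_hz) / len(freqs_hz))

def sigma_tau_synthesis(ts_over_ta, total_bw_hz, tau_s, freqs_hz):
    """Eq. (A12.34): delay error for N channels with total bandwidth total_bw_hz."""
    return (ts_over_ta / math.sqrt(8.0 * math.pi ** 2)
            / (math.sqrt(total_bw_hz * tau_s) * rms_bandwidth(freqs_hz)))

def sigma_tau_continuous(ts_over_ta, bw_hz, tau_s):
    """Eq. (A12.31): delay error for a continuous passband of width bw_hz."""
    return (math.sqrt(3.0 / (2.0 * math.pi ** 2)) * ts_over_ta
            / math.sqrt(bw_hz ** 3 * tau_s))

# Contiguous case: 64 channel centres filling a 16-MHz band.
bw = 16.0e6
contiguous = [(i + 0.5) * bw / 64 for i in range(64)]
# Hypothetical bandwidth-synthesis layout: channel offsets in Hz.
spread = [f * 1.0e6 for f in (0, 10, 40, 100, 210, 290, 340, 360)]

err_contiguous = sigma_tau_synthesis(100.0, bw, 100.0, contiguous)
err_continuous = sigma_tau_continuous(100.0, bw, 100.0)
err_spread = sigma_tau_synthesis(100.0, bw, 100.0, spread)
rms_fraction = rms_bandwidth(spread) / (max(spread) - min(spread))
```

For the contiguous channels, Eq. (A12.34) reproduces Eq. (A12.31) to within the discretization of the band, whereas spreading the same total bandwidth over a wider frequency range reduces the delay error in proportion to the increase in Δν_rms.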

A general formulation of the linear least-mean-squares solution can be found when the model function f is a linear function of the parameters p _k, that is, when

$$\displaystyle{ f(x\,;p_{1},\ldots,p_{n}) =\sum \limits _{ k=1}^{n} \frac{\partial f} {\partial p_{k}}p_{k}\;, }$$

(A12.35)

where n is the number of parameters. For example, the model could be a cubic polynomial

$$\displaystyle{ f(x\,;p_{0},p_{1},p_{2},p_{3}) = p_{0} + p_{1}x + p_{2}x^{2} + p_{ 3}x^{3}\;, }$$

(A12.36)

in which case ∂ f∕∂ p _k = x ^k for k = 0, 1, 2, and 3. If the parameters appear as linear multiplicative factors, then the minimization of Eq. (A12.13) leads to a set of n equations of the form

$$\displaystyle{ \frac{\partial \chi ^{2}} {\partial p_{k}} = 0\;,\qquad k = 1,2,\ldots,n\,. }$$

(A12.37)

Substitution of Eq. (A12.13) into Eq. (A12.37) and use of Eq. (A12.35) yield the set of n equations

$$\displaystyle{ D_{k} =\sum \limits _{ j=1}^{n}T_{ jk}p_{j},\qquad k = 1,2,\ldots,n\;, }$$

(A12.38)

where

$$\displaystyle{ D_{k} =\sum \limits _{ i=1}^{N}\frac{y_{i}} {\sigma _{i}^{2}} \frac{\partial f(x_{i})} {\partial p_{k}} }$$

(A12.39)

and

$$\displaystyle{ T_{jk} =\sum \limits _{ i=1}^{N} \frac{1} {\sigma _{i}^{2}}\,\frac{\partial f(x_{i})} {\partial p_{j}} \,\frac{\partial f(x_{i})} {\partial p_{k}} \;, }$$

(A12.40)

and the summations are carried out over the set of N independent measurements. In matrix notation, the equation set (A12.38) is

$$\displaystyle{ [D] = [T][P_{e}]\;, }$$

(A12.41)

where [D] is a column matrix with elements D _k, [P _e] is a column matrix containing the estimates of the parameters p _ek, and [T] is a symmetric square matrix with elements T _jk. For obvious reasons, [T] is sometimes called the matrix of the normal equations. Note that Eq. (A12.41) is a generalization of Eq. (A12.16).

The matrices [T] and [D] are sometimes written as the product of other matrices (Hamilton 1964, Ch. 4). Let [M] be the variance matrix (size N × N) whose diagonal elements are σ _i ² and whose off-diagonal elements are zero; let [F] be a column matrix containing the data y _i; and let [A] be the partial derivative matrix (N rows by n columns) whose elements are ∂ f(x _i)∕∂ p _k. Then one can write [T] = [A]^T[M]⁻¹[A] and [D] = [A]^T[M]⁻¹[F], where [A]^T is the transpose of [A] and [M]⁻¹ is the inverse of [M]. The analysis can be generalized to include the situation in which the errors between measurements are correlated. In this case, [M] is modified to include off-diagonal elements σ _i σ _j ρ′_ij, where ρ′_ij is the correlation coefficient for the ith and jth measurements.

The solution to Eq. (A12.41) is

$$\displaystyle{ [P_{e}] = [T]^{-1}[D]\;, }$$

(A12.42)

where [T]⁻¹ is the inverse matrix of [T], and [P _e] is the column matrix containing the parameter estimates. The elements of [T]⁻¹ are denoted T′_jk. It can be shown by direct calculation that the estimates of the errors of the parameters σ _ek ² are the diagonal elements of [T]⁻¹, which is called the covariance matrix. Thus,

$$\displaystyle{ \sigma _{ek}^{2} = T'_{ kk}\;. }$$

(A12.43)
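The chain from Eq. (A12.39) to Eq. (A12.43) can be written out compactly. The following sketch assumes uncorrelated measurement errors and uses the cubic polynomial of Eq. (A12.36) as the model; the helper names (`invert`, `linear_lsq`), the sample points, and the parameter values are illustrative assumptions, and a production analysis would normally rely on a tested linear-algebra library rather than the small Gauss-Jordan routine shown here.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def linear_lsq(xs, ys, sigmas, basis):
    """Solve [D] = [T][P_e]; basis[k](x) returns the partial derivative df/dp_k."""
    n = len(basis)
    d = [sum(y / s ** 2 * basis[k](x) for x, y, s in zip(xs, ys, sigmas))
         for k in range(n)]                                    # Eq. (A12.39)
    t = [[sum(basis[j](x) * basis[k](x) / s ** 2 for x, s in zip(xs, sigmas))
          for k in range(n)] for j in range(n)]                # Eq. (A12.40)
    t_inv = invert(t)                                          # covariance matrix
    p_est = [sum(t_inv[k][j] * d[j] for j in range(n))
             for k in range(n)]                                # Eq. (A12.42)
    errors = [t_inv[k][k] ** 0.5 for k in range(n)]            # Eq. (A12.43)
    return p_est, errors, t_inv

# Cubic polynomial of Eq. (A12.36): df/dp_k = x**k for k = 0, 1, 2, 3.
cubic_basis = [lambda x, k=k: x ** k for k in range(4)]
xs = [i / 10.0 - 1.0 for i in range(21)]           # 21 points on [-1, 1]
true_p = [1.0, -2.0, 0.5, 3.0]                     # assumed parameter values
ys = [sum(p * x ** k for k, p in enumerate(true_p)) for x in xs]   # noise-free
p_est, errors, cov = linear_lsq(xs, ys, [0.1] * len(xs), cubic_basis)
```

Because the data in this example are noise-free, the estimates reproduce the assumed parameters, while `errors` depends only on the sample points and the assumed σ_i, not on the data values.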

The probability that parameter p _k will be within ±σ _k of its true value is 0.68, which is the integral under the one-dimensional Gaussian probability distribution between ±σ _k. The probability that all of the n parameters will be within ±σ of their true values (i.e., within the error “box” in the n-dimensional space) is approximately 0.68ⁿ when the correlations are small.

The normalized correlation coefficients between parameters are proportional to the off-diagonal elements of [T]⁻¹:

$$\displaystyle{ \rho _{jk} = \frac{\left \langle (\,p_{ej} - p_{j})(\,p_{ek} - p_{k})\right \rangle } {\sigma _{ek}\,\sigma _{ej}} = \frac{T'_{jk}} {\sqrt{T'_{jj } T'_{kk}}}\;. }$$

(A12.44)

For any two parameters, there is a bivariate Gaussian probability distribution that describes the distribution of errors

$$\displaystyle{ p\,(\epsilon _{j},\epsilon _{k}) = \frac{1} {2\pi \sigma _{j}\sigma _{k}\sqrt{1 -\rho _{ jk }^{2}}}\exp \left \{- \frac{1} {2(1 -\rho _{jk}^{2})}\left [\frac{\epsilon _{j}^{2}} {\sigma _{j}^{2}} + \frac{\epsilon _{k}^{2}} {\sigma _{k}^{2}} -\frac{2\rho _{jk}\epsilon _{j}\epsilon _{k}} {\sigma _{j}\sigma _{k}} \right ]\right \}\;, }$$

(A12.45)

where ε _k = p _ek − p _k and ε _j = p _ej − p _j. The contour of p (ε _k, ε _j) = p(0, 0) e^{−1∕2} defines an ellipse, shown in Fig. A12.1, which is known as the error ellipse. The probability that both parameters will lie within the error ellipse is the integral of Eq. (A12.45) over the area of the error ellipse, which equals 1 − e^{−1∕2} ≃ 0.39. The orientation of the error ellipse is given by

$$\displaystyle{ \psi _{jk} = \frac{1} {2}\tan ^{-1}\left ( \frac{2\rho _{jk}\sigma _{j}\sigma _{k}} {\sigma _{j}^{2} -\sigma _{k}^{2}}\right )\;. }$$

(A12.46)
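Given the three independent elements of a 2 × 2 block of [T]⁻¹, Eqs. (A12.43), (A12.44), and (A12.46) can be evaluated as in the sketch below. The function name is an assumption, and so is the use of `atan2` in place of a plain arctangent: it avoids a division by zero when σ_j = σ_k and selects the branch for which ψ is the orientation of the major axis measured from the ε_j axis.

```python
import math

def error_ellipse(cov_jj, cov_kk, cov_jk):
    """Return (sigma_j, sigma_k, rho_jk, psi_jk) from elements of [T]^-1."""
    sigma_j = math.sqrt(cov_jj)                  # Eq. (A12.43)
    sigma_k = math.sqrt(cov_kk)
    rho = cov_jk / (sigma_j * sigma_k)           # Eq. (A12.44)
    # Eq. (A12.46), evaluated with atan2 so that equal variances
    # (zero denominator) are handled without a division.
    psi = 0.5 * math.atan2(2.0 * rho * sigma_j * sigma_k,
                           sigma_j ** 2 - sigma_k ** 2)
    return sigma_j, sigma_k, rho, psi

# Illustrative covariance elements (assumed values).
sigma_j, sigma_k, rho, psi = error_ellipse(4.0, 1.0, 1.0)
```

For σ_j > σ_k the result coincides with the principal value of the arctangent in Eq. (A12.46); for σ_j < σ_k the two differ by π∕2, which corresponds to exchanging the major and minor axes.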

The errors in the parameters p _k are completely determined by the matrix [T]⁻¹ through Eqs. (A12.43)–(A12.45). The elements of [T]⁻¹ depend only on the partial derivatives of the model function and the values of the measurement errors, which can usually be predicted in advance from the characteristics of the measurement apparatus. Therefore, once an experiment is planned, the errors in the parameters can be predicted from [T]⁻¹ without reference to the data. For this reason, [T] is sometimes called the design matrix. Studies of the design matrix for a specific experiment might reveal a very high correlation between two parameters, leading to large errors in their estimated values. It is often possible to modify the experiment to obtain more data that will reduce the correlation.

After the data are analyzed, the value of χ ² can be computed. If the model is a good fit to the data, χ ² should be approximately equal to N − n, the number of measurements minus the number of parameters. If it is not, the difficulty is often that the values of σ _i are estimated incorrectly or that the model does not describe adequately the measurement system, that is, the model has too few parameters or is not correct. Even if χ ² ≃ N − n, the derived errors in Eq. (A12.43) may not be realistic, and they are referred to as “formal errors.”

The formal errors describe the precision of the parameter estimates. The accuracy of the parameter measurements is the deviation between the estimates of the parameters and the true values of the parameters. The accuracy of the measurements is often difficult to determine. For example, an unknown effect that closely mimics the functional dependence of one of the model parameters may be present in an experiment. The model may appear to be a good one, but the accuracy of the particular model parameter in question will be much poorer than expected because of the systematic error introduced by the unmodeled effect.

We can envision how the principles of least-mean-squares analysis are applied to a large astrometric experiment. Consider a hypothetical VLBI experiment made on a three-station array. Suppose that ten recordings are made of each of 20 sources during observations made over one day (an epoch). The observations are repeated six times a year for five years. The data set would consist of 18,000 measurements (20 sources × 10 observations × 3 baselines × 30 epochs) of delay and fringe frequency, or 36,000 total measurements. The measurements of delay and fringe frequency can be combined in the analysis since, in the least-mean-squares method, the relevant quantities are the squares of the measurements divided by their variances, which are dimensionless, as in Eq. (A12.13). Now we can count the number of parameters in the analysis model: 39 source coordinates (1 right ascension fixed), 9 station coordinates, 90 atmospheric parameters (a zenith excess path length at each station at each epoch), 120 clock parameters (a clock error and clock rate error at two of the stations per epoch), and 90 polar motion and UT1–UTC parameters, as well as several other parameters to model precession, nutation, solid-Earth tides, gravitational deflection by the Sun, movement of stations, and other effects such as antenna axis offsets (see Sect. 4.6.1). The total number of parameters is about 360. The parameters within each observation epoch are linked because of the common clock and atmosphere parameters. Parameters among epochs are linked because of baseline, precession, and nutation parameters. Naturally, partial solutions from subsets of the data should be obtained before a grand global solution is attempted. Procedures are available for obtaining global solutions that do not require the inversion of matrices as large as the total number of parameters [see, e.g., Morrison (1969)]. Experiments of the scale described here, and larger ones, have been carried out [e.g., Fanselow et al. (1984), Herring et al. (1985), and Ma et al. (1998)].
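The bookkeeping for this hypothetical experiment can be laid out explicitly. In the short sketch below, the variable names are illustrative, and the split of the 90 Earth-orientation parameters into three per epoch (two polar-motion components and UT1–UTC) is an assumption consistent with the totals quoted above.

```python
# Bookkeeping for the hypothetical three-station experiment described above.
sources, scans_per_source, baselines = 20, 10, 3
stations = 3
epochs = 6 * 5                                   # six epochs a year for five years

delay_measurements = sources * scans_per_source * baselines * epochs
total_measurements = 2 * delay_measurements      # delay and fringe frequency

parameters = {
    "source coordinates": 2 * sources - 1,       # one right ascension held fixed
    "station coordinates": 3 * stations,
    "atmosphere": stations * epochs,             # zenith path per station per epoch
    "clocks": 2 * (stations - 1) * epochs,       # offset and rate at two stations
    "polar motion and UT1-UTC": 3 * epochs,      # assumed: x, y, and UT1-UTC
}
listed_parameters = sum(parameters.values())
print(delay_measurements, total_measurements, listed_parameters)
```

The explicitly listed parameters sum to 348; the additional terms for precession, nutation, tides, and the other effects mentioned bring the total to the quoted figure of about 360, which is two orders of magnitude smaller than the number of measurements.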

A12.1.2 Nonlinear Case

The discussion of linear least-mean-squares analysis can be generalized to include nonlinear functions in a straightforward manner. Assume that f(x ; p) has one nonlinear parameter p _n. For the purpose of discussion, we can separate f into linear and nonlinear parts, f _L(x ; p ₁, …, p _n−1) and f _NL(x ; p _n), and approximate the nonlinear function by the first two terms in a Taylor expansion

$$\displaystyle{ f_{NL}(x\,;p_{n}) \simeq f_{NL}\,(x,p_{0n}) + \frac{\partial f_{NL}} {\partial p_{n}} \varDelta p_{n}\;, }$$

(A12.47)

where p _0n is the initial guess of parameter p _n and Δ p _n = p _n − p _0n. We assume that the initial parameter guesses are accurate enough for Eq. (A12.47) to be valid. We replace the data with y _i − f _NL(x _i; p _0n) and then compute the elements of the matrices [D] and [T] from the partial derivatives, including ∂ f _NL∕∂ p _n. The nth parameter in the matrix [P _e] in Eq. (A12.42) will be the differential parameter Δ p _n defined in Eq. (A12.47). The solution must be iterated with a new Taylor expansion centered on the parameter p _0n +Δ p _n. Thus, nonlinear functions can be accommodated in the analysis through linearization, but initial guesses of the nonlinear parameters and solution iteration are required. In some cases, nonlinear estimation problems can cause difficulties [see, e.g., Lampton et al. (1976), Press et al. (1992)]. Recently, the use of the Markov chain Monte Carlo (MCMC) method has become widespread (Sivia and Skilling 2006).
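The iteration described above can be illustrated with a model having one linear and one nonlinear parameter. The model f(x) = p₁ + exp(−p₂x), the function name, the unit weights, and the data values in the following sketch are all assumptions made for the example; the procedure is the linearization of Eq. (A12.47) applied repeatedly, which is commonly known as the Gauss-Newton method.

```python
import math

def fit_offset_plus_decay(xs, ys, rate_guess, iterations=20):
    """Fit f(x) = p1 + exp(-p2 * x) by iterated linearization in p2."""
    p1, p2 = 0.0, rate_guess
    for _ in range(iterations):
        # Replace the data with y_i - f_NL(x_i; p_0n).
        resid = [y - math.exp(-p2 * x) for x, y in zip(xs, ys)]
        # Partial derivatives: df/dp1 = 1 and df_NL/dp2 = -x exp(-p2 x).
        g = [-x * math.exp(-p2 * x) for x in xs]
        # Normal equations (unit weights) for the pair (p1, delta_p2).
        t11, t12, t22 = float(len(xs)), sum(g), sum(v * v for v in g)
        d1, d2 = sum(resid), sum(r * v for r, v in zip(resid, g))
        det = t11 * t22 - t12 * t12
        p1 = (t22 * d1 - t12 * d2) / det
        delta_p2 = (t11 * d2 - t12 * d1) / det
        p2 += delta_p2                   # re-centre the Taylor expansion
        if abs(delta_p2) < 1e-12:
            break
    return p1, p2

# Noise-free test data with assumed true values p1 = 2.0 and p2 = 0.7.
xs = [0.2 * i for i in range(21)]
ys = [2.0 + math.exp(-0.7 * x) for x in xs]
p1_est, p2_est = fit_offset_plus_decay(xs, ys, rate_guess=0.5)
```

With a reasonable initial guess the correction Δp₂ shrinks rapidly from one iteration to the next; a poor initial guess can cause the iteration to fail, which is one of the difficulties referred to above.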

A12.1.3 (u, v) vs. Image Plane Fitting

One final topic concerns the estimation of the coordinates of a radio source with a well-calibrated interferometer, which has accurately known baselines and instrumental phases. In this case, the differential interferometer phase is, from Eq. (12.2),

$$\displaystyle\begin{array}{rcl} \varDelta \phi & =& \;2\pi D_{\lambda }\left \{\left [\sin d\cos \delta -\cos d\sin \delta \cos (H - h)\right ]\varDelta \delta \right. \\ & & \left.+\cos d\cos \delta \sin (H - h)\varDelta \alpha \right \}\;.{}\end{array}$$

(A12.48)

Expressing the geometric quantities in terms of projected baseline components, we can write Eq. (A12.48) as

$$\displaystyle{ \varDelta \phi = 2\pi \,(u\varDelta \alpha ' + v\varDelta \delta )\;, }$$

(A12.49)

where Δ α′ = Δ αcosδ. A set of phase measurements from one or more baselines can be analyzed by the method of least-mean-squares to determine Δ α′ and Δ δ. The partial derivatives are ∂ f∕∂ p ₁ = 2π u and ∂ f∕∂ p ₂ = 2π v, where p ₁ = Δ α′ and p ₂ = Δ δ. From Eqs. (A12.40) and (A12.49), the normal-equation matrix is

$$\displaystyle{ [T] = \frac{4\pi ^{2}} {\sigma _{\phi }^{2}} \left [\begin{array}{*{10}c} \sum u_{i}^{2}\quad & \sum u_{ i}v_{i}\\ \\ \\ \sum u_{i}v_{i}\quad & \sum v_{i}^{2} \\ \end{array} \right ]\;, }$$

(A12.50)

where all the measurements are assumed to have the same uncertainty σ _ϕ given by Eq. (A12.26). The inverse of [T] is

$$\displaystyle{ [T]^{-1} = \frac{1} {\varDelta } \left [\begin{array}{*{10}c} \sum v_{i}^{2}\quad & -\sum u_{ i}v_{i}\\ \\ \\ -\sum u_{i}v_{i}\quad & \sum u_{i}^{2} \\ \end{array} \right ]\;, }$$

(A12.51)

where Δ is the scalar factor 4π ²∕σ _ϕ ² of Eq. (A12.50) multiplied by the determinant of the 2 × 2 matrix of sums,

$$\displaystyle{ \varDelta = \frac{4\pi ^{2}} {\sigma _{\phi }^{2}} \left [\sum u_{i}^{2}\sum v_{ i}^{2} -\left (\sum u_{ i}v_{i}\right )^{2}\right ]\;. }$$

(A12.52)

The correlation coefficient defined by Eq. (A12.44) is

$$\displaystyle{ \rho _{12} = \frac{-\sum u_{i}v_{i}} {\sqrt{\sum u_{i }^{2 }\sum v_{i }^{2}}}\;. }$$

(A12.53)

The variances of the estimates of the parameters are given by the diagonal elements of Eq. (A12.51),

$$\displaystyle{ \sigma _{\alpha '}^{2} = \frac{\sigma _{\phi }^{2}\sum v_{ i}^{2}} {4\pi ^{2}\left [\sum v_{i}^{2}\sum u_{i}^{2} -\left (\sum u_{i}v_{i}\right )^{2}\right ]}\;, }$$

(A12.54)

and

$$\displaystyle{ \sigma _{\delta }^{2} = \frac{\sigma _{\phi }^{2}\sum u_{ i}^{2}} {4\pi ^{2}\left [\sum v_{i}^{2}\sum u_{i}^{2} -\left (\sum u_{i}v_{i}\right )^{2}\right ]}\;. }$$

(A12.55)

If the (u, v) loci are long (that is, the observations extend over a large fraction of the day), then ∑ u _i v _i will be small compared to ∑ u _i ² and ∑ v _i ² so that

$$\displaystyle{ \sigma _{\alpha '} \simeq \frac{\sigma _{\phi }} {2\pi \sqrt{\sum u_{i }^{2}}}\;, }$$

(A12.56)

and

$$\displaystyle{ \sigma _{\delta } \simeq \frac{\sigma _{\phi }} {2\pi \sqrt{\sum v_{i }^{2}}}\;. }$$

(A12.57)

Furthermore, if only one baseline is used on a high-declination source, then u _i ≃ v _i ≃ D _λ, and both errors reduce to the intuitive result

$$\displaystyle{ \sigma _{\alpha '} \simeq \sigma _{\delta }\simeq \frac{\sigma _{\phi }} {2\pi \sqrt{N}D_{\lambda }}\;. }$$

(A12.58)
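Equations (A12.53)–(A12.55) are straightforward to evaluate for a given set of (u, v) samples. The sketch below does so for a hypothetical east-west baseline, for which the (u, v) locus is an ellipse; the function names, the baseline length of 10⁶ wavelengths, the declination of 60°, and the phase error of 0.1 rad are illustrative assumptions.

```python
import math

def position_errors(us, vs, sigma_phi):
    """Return (sigma_alpha', sigma_delta, rho_12) for equal-weight phase data."""
    suu = sum(u * u for u in us)
    svv = sum(v * v for v in vs)
    suv = sum(u * v for u, v in zip(us, vs))
    bracket = suu * svv - suv * suv
    sigma_alpha = sigma_phi * math.sqrt(svv / bracket) / (2.0 * math.pi)  # (A12.54)
    sigma_delta = sigma_phi * math.sqrt(suu / bracket) / (2.0 * math.pi)  # (A12.55)
    rho = -suv / math.sqrt(suu * svv)                                     # (A12.53)
    return sigma_alpha, sigma_delta, rho

def east_west_track(d_lambda, dec_rad, hour_angles_rad):
    """Hypothetical (u, v) locus of an east-west baseline (an ellipse)."""
    us = [d_lambda * math.cos(h) for h in hour_angles_rad]
    vs = [d_lambda * math.sin(dec_rad) * math.sin(h) for h in hour_angles_rad]
    return us, vs

d_lambda, dec, sigma_phi = 1.0e6, math.radians(60.0), 0.1
# Full 24-h track sampled hourly, and a short track of six samples.
full_u, full_v = east_west_track(d_lambda, dec, [2.0 * math.pi * i / 24 for i in range(24)])
short_u, short_v = east_west_track(d_lambda, dec, [0.1 * i for i in range(6)])
full = position_errors(full_u, full_v, sigma_phi)
short = position_errors(short_u, short_v, sigma_phi)
```

For the full track the cross term vanishes and the errors agree with Eqs. (A12.56) and (A12.57); for the short track the two coordinates are strongly correlated and both errors increase accordingly.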

Alternatively, the source position can be found by Fourier transformation of the visibility data. This procedure can be thought of as image plane fitting or as multiplying the visibility data by the exponential factors exp[j2π (u _i Δ α′ + v _i Δ δ)] and summing over the data. The resulting “function” is maximized with respect to Δ α′ and Δ δ. In this latter view, it is easy to understand that (basic) image plane fitting (that is, no tapering or gridding of the data) is a maximum-likelihood procedure for finding the position of a point source and therefore formally equivalent to the method of least-mean-squares. The synthesized beam b ₀ for N measurements is

$$\displaystyle{ b_{0}(\varDelta \alpha ',\varDelta \delta ) = \frac{1} {N}\sum \cos \left [2\pi \,(u_{i}\varDelta \alpha ' + v_{i}\varDelta \delta )\right ]\;. }$$

(A12.59)

The shape of b ₀ near its peak can be found by expanding Eq. (A12.59) to second order:

$$\displaystyle{ b_{0}(\varDelta \alpha ',\varDelta \delta ) \simeq 1 -\frac{2\pi ^{2}} {N}\Bigg(\varDelta \alpha '^{2}\sum u_{ i}^{2} +\varDelta \delta ^{2}\sum v_{ i}^{2} + 2\varDelta \alpha '\varDelta \delta \sum u_{ i}v_{i}\Bigg)\;. }$$

(A12.60)

From Eq. (A12.60), it is easy to see that the contours of the synthesized beam are proportional to the error ellipse defined by Eqs. (A12.45), (A12.46), and (A12.53)–(A12.55). Note that the method of least-mean-squares can be applied only in the regime of high signal-to-noise ratio, where phase ambiguities can be resolved. However, the Fourier synthesis method can be applied in any case.
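The Fourier approach can be sketched as a direct search over trial offsets. The example below assumes unit-amplitude visibilities whose phases follow Eq. (A12.49), and it adopts the sign convention in which the conjugate phase factor is applied for each trial position so that the summed amplitude peaks at the true offset; the function name, the grid spacing, and the (u, v) track are hypothetical choices made for the illustration.

```python
import cmath
import math

def fourier_position_search(us, vs, phases, step, half_width):
    """Grid-search the offsets (x, y) maximizing |sum exp(j(phi_i - 2 pi (u x + v y)))|."""
    best = (-1.0, 0.0, 0.0)
    for ix in range(-half_width, half_width + 1):
        for iy in range(-half_width, half_width + 1):
            x, y = ix * step, iy * step
            total = sum(cmath.exp(1j * (phi - 2.0 * math.pi * (u * x + v * y)))
                        for u, v, phi in zip(us, vs, phases))
            if abs(total) > best[0]:
                best = (abs(total), x, y)
    return best[1], best[2]

# Hypothetical east-west baseline of 1e6 wavelengths, source at declination 60 deg.
d_lambda, dec = 1.0e6, math.radians(60.0)
hours = [2.0 * math.pi * i / 24 for i in range(24)]
us = [d_lambda * math.cos(h) for h in hours]
vs = [d_lambda * math.sin(dec) * math.sin(h) for h in hours]

step = 5.0e-8                                   # grid spacing in radians
true_x, true_y = 3 * step, -2 * step            # assumed true offsets
phases = [2.0 * math.pi * (u * true_x + v * true_y) for u, v in zip(us, vs)]
est_x, est_y = fourier_position_search(us, vs, phases, step, 10)
```

Because only the complex exponentials of the phases enter the sum, the search is unaffected by 2π ambiguities in the individual phase measurements, which is why this approach remains usable at low signal-to-noise ratio.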

Second-Order Effects in Phase Referencing

We present a more general analysis of how an error in the position of a calibrator source affects the determination of the position of a target source. Suppose the phase of an interferometer is referenced to its tracking center [corresponding to θ _c in Eq. (12.33)]. If the calibrator has coordinate errors x _c = Δ α _ccosδ _c and y _c = Δ δ _c, then the residual phase is

$$\displaystyle{ \varDelta \phi _{c_{1}} = 2\pi \,(u_{c}x_{c} + v_{c}y_{c})\;. }$$

(A12.61)

This causes a shift in phase at the position of the target of

$$\displaystyle{ \varDelta \phi _{c_{2}} = 2\pi \,(u_{t}x_{c} + v_{t}y_{c})\;. }$$

(A12.62)

Since the (u, v) coordinates are slightly different, there will be a second-order phase shift of $\varDelta ^{2}\phi =\varDelta \phi _{c_{2}} -\varDelta \phi _{c_{1}}$:

$$\displaystyle\begin{array}{rcl} \varDelta ^{2}\phi & =& 2\pi \,[(u_{ t} - u_{c})x_{c} + (v_{t} - v_{c})y_{c}] \\ & =& 2\pi \,(\varDelta ux_{c} +\varDelta vy_{c})\;. {}\end{array}$$

(A12.63)

This leads to the same approximation given in Eq. (12.39). A complete expression for Eq. (A12.63) can be derived by calculating the differential quantities Δ u and Δ v from Eq. (4.3).
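The size of the second-order term is easily evaluated once the (u, v) coordinates of the target and calibrator are known. In the sketch below, the function name and all numerical values (a calibrator position error of about 1 mas and a difference in u of 10⁵ wavelengths) are illustrative assumptions, with the subscripts t and c denoting the target and calibrator, respectively.

```python
import math

def second_order_phase(u_t, v_t, u_c, v_c, x_c, y_c):
    """Eq. (A12.63): residual phase (radians) transferred to the target."""
    phase_at_calibrator = 2.0 * math.pi * (u_c * x_c + v_c * y_c)   # Eq. (A12.61)
    phase_at_target = 2.0 * math.pi * (u_t * x_c + v_t * y_c)       # Eq. (A12.62)
    return phase_at_target - phase_at_calibrator

MAS = math.radians(1.0 / 3.6e6)          # one milliarcsecond in radians
# Assumed example: 1-mas error in x_c only, u differing by 1e5 wavelengths.
residual = second_order_phase(u_t=5.1e6, v_t=2.0e6, u_c=5.0e6, v_c=2.0e6,
                              x_c=1.0 * MAS, y_c=0.0)
```

The residual scales linearly with both the calibrator position error and the separation of the two sources in the (u, v) plane, so that a small angular separation between target and calibrator reduces the effect.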

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this chapter

Cite this chapter

Thompson, A.R., Moran, J.M., Swenson, G.W. (2017). Interferometer Techniques for Astrometry and Geodesy. In: Interferometry and Synthesis in Radio Astronomy. Astronomy and Astrophysics Library. Springer, Cham. https://doi.org/10.1007/978-3-319-44431-4_12

DOI: https://doi.org/10.1007/978-3-319-44431-4_12
Published: 23 February 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44429-1
Online ISBN: 978-3-319-44431-4
eBook Packages: Physics and Astronomy, Physics and Astronomy (R0)
