As mentioned in the introduction, the Schwartz approach to distribution theory defines distributions as continuous linear functions on a test function space. The various classes of distributions are distinguished by the underlying test function spaces. Before we come to the definition of the main classes of Schwartz distributions, we collect some basic facts about continuous linear functions or functionals on a Hausdorff locally convex topological vector space (HLCTVS) and about spaces of such functionals. Then the definition of the three main spaces of Schwartz distributions is straightforward. Numerous examples illustrate this definition.

The remainder of this chapter introduces convergence of sequences and series of distributions and discusses localization, in particular, support and singular support of distributions.

1 The Topological Dual of an HLCTVS

Suppose that X is a vector space over the field \(\mathbb K\) on which a filtering system \(\mathcal{P}\) of seminorms is given such that \(X[\mathcal{P}]\) is an HLCTVS. The algebraic dual \(X^*\) of X has been defined as the set of all linear functions or functionals \(f:X\to \mathbb K\). The topological dual is defined as the subset of those linear functions which are continuous, i.e.,

$$X^{\prime} \equiv X[\mathcal{P}]^{\prime} =\left\{f \in X^*: f\;{\rm continuous}\right\}$$
(3.1)

In a natural way, both \(X^*\) and \(X^{\prime}\) are vector spaces over \(\mathbb K\). As a special case of Theorem 2.4, the following result is a convenient characterization of the elements of the topological dual of an HLCTVS.

Proposition 3.1

Suppose that \(X[\mathcal{P}]\) is an HLCTVS and \(f:X \to \mathbb K\) a linear function. Then the following statements are equivalent.

  1. (a)

    f is continuous, i.e., \(f \in X^{\prime}\).

  2. (b)

    There is a seminorm \(p\in \mathcal{P}\) and a nonnegative number λ such that \(|f(x)| \leq \lambda p(x)\) for all \(x \in X\).

  3. (c)

    There is a seminorm \(p \in \mathcal{P}\) such that f is bounded on the p-ball \(B_{p,1}(0)\).

Proof

The equivalence of statements (a) and (b) is just the special case \(Y[\mathcal{Q}]=\mathbb K[\!\left\{|\cdot|\right\}\!]\) of Theorem 2.4.

The equivalence of (b) and (c) follows easily from Lemma 2.2 if we introduce the seminorm \(q(x)=|f(x)|\) on X and if we observe that then (b) says \(q\leq \lambda p\) while (c) translates into \(B_{p,1}(0)\subseteq B_{q,\lambda}(0)\). □

The geometrical interpretation of linear functionals is often helpful, in particular in infinite dimensional spaces. We give a brief review. Recall: A hyperplane through the origin is a maximal proper subspace H of a vector space X. If such a hyperplane H is given, there is a point \(a \in X\backslash H\) such that the vector space X over the field \(\mathbb K\) has the representation

$$X=H +\mathbb K a,$$

i.e., every point \(x\in X\) has the unique representation \(x=h+\alpha a\) with \(h\in H\) and \(\alpha \in \mathbb K\). The announced geometrical characterization now is

Proposition 3.2

Let \(X[\mathcal{P}]\) be an HLCTVS over the field \(\mathbb K\).

  1. (a)

    A linear functional \(f \in X^*\), \(f\neq 0\), is characterized by

    1. (i)

      a hyperplane \(H\subset X\) through the origin and

    2. (ii)

      the value at a point \(x_0 \in X\backslash H\).

    The connection between the functional f and the hyperplane is given by

    $$H=\,{\rm ker}\,{f}=\left\{x\in X:f(x)=0\right\}\!.$$
  2. (b)

    A linear functional f on X is continuous if, and only if, in the geometric characterization (a) the hyperplane H is closed.

Proof

Given \(f\in X^*\), the kernel or null space ker f is easily seen to be a linear subspace of X. Since \(f\neq 0\), there is a point in X at which f does not vanish. By rescaling this point we get a point \(a \in X\backslash \textrm{ker}\,{f}\) with \(f(a)=1\). We claim that \(H=\,{\rm ker}\,{f}\) is a hyperplane. Given any point \(x \in X\) observe \(x=x-f(x)a + f(x)a\) where \(h=x-f(x)a \in \,\mathrm{ker}\,{f}\) since \(f(h)=f(x)-f(x)f(a)=0\) and \(f(x)a \in \mathbb K a\). The representation \(x=h+\alpha a\) with \(h\in \;\textrm{ker}\;{f}\) and \(\alpha \in \mathbb K\) is unique: If one has, for some \(x \in X\), \(x=h_1 +\alpha_1 a =h_2 +\alpha_2 a\) with \(h_i \in \;\textrm{ker}\;{f}\) then \(h_1 -h_2=(\alpha_1 -\alpha_2)a\) and thus \(0=f(h_1 -h_2)=(\alpha_1- \alpha_2)f(a)= \alpha_1 - \alpha_2\), hence \(\alpha_1=\alpha_2\) and \(h_1=h_2\).

Conversely, assume that H is a hyperplane through the origin and \(a\in X\backslash H\). Then every point \(x\in X\) has the unique representation \(x=h + \alpha a\) with \(h \in H\) and \(\alpha \in \mathbb K\). Now define \(f_H:X\to \mathbb K\) by \(f_H(x)=f_H(h+\alpha a)=\alpha\). It is an elementary calculation to show that \(f_H\) is a well-defined linear function. Certainly one has \(\textrm{ker}\;{f_H}=H\). This proves part (a).

In order to prove part (b), we have to show that \(H=\;\textrm{ker}\;{f}\) is closed if, and only if, the linear functional f is continuous. When f is continuous then \(\textrm{ker}\;{f}\) is closed as the inverse image of the closed set \(\left\{0\right\}\). Conversely, assume that \(H=\;\textrm{ker}\;{f}\) is closed. Then its complement \(X\backslash H\) is open and there is some open p-ball \(B_{p,r}(a) \subset X \backslash H\) around the point a, \(f(a)=1\). In order to prove continuity of f it suffices, according to Proposition 3.1, to show that f is bounded on the open ball \(B_{p,r}(0)\). This is done indirectly. If there were some \(x \in B_{p,r}(0)\) with \(|f(x)|\geq 1\), then \(y=a-\frac{x}{f(x)} \in B_{p,r}(a)\), since \(p(y-a)=\frac{p(x)}{|f(x)|}\leq p(x)<r\), and \(f(y)=f(a)-\frac{f(x)}{f(x)}=1-1=0\), i.e., \(y\in H\), a contradiction. Therefore, f is bounded on \(B_{p,r}(0)\) by 1, hence by linearity on \(B_{p,1}(0)\) by \(\frac{1}{r}\), and we conclude. □
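The decomposition \(x=h+f(x)a\) used in the proof of part (a) can be checked in a small finite-dimensional sketch. The following Python fragment assumes \(X=\mathbb{R}^3\) and a functional \(f(x)=c\cdot x\) with an arbitrarily chosen coefficient vector; the names `C`, `f`, `decompose`, and the sample points are illustrative assumptions and do not come from the text. Since every linear functional on a finite-dimensional space is continuous, the sketch illustrates only the algebraic statement, not the closedness criterion of part (b).

```python
# Illustration of x = h + f(x) * a with h in ker f, for X = R^3.
# The coefficient vector C is an arbitrary (hypothetical) choice.
C = (1.0, -2.0, 0.5)


def f(x):
    """Linear functional f(x) = C . x on R^3."""
    return sum(c * xi for c, xi in zip(C, x))


def decompose(x, a):
    """Return (h, alpha) with x = h + alpha * a, assuming f(a) = 1."""
    alpha = f(x)
    h = tuple(xi - alpha * ai for xi, ai in zip(x, a))
    return h, alpha


# Rescale a point where f does not vanish so that f(a) = 1.
a0 = (0.0, 1.0, 0.0)
a = tuple(ai / f(a0) for ai in a0)

x = (3.0, 1.0, 4.0)
h, alpha = decompose(x, a)
print("f(a) =", f(a), " f(h) =", f(h), " alpha =", alpha)
```

The component `h` lies in the kernel of `f`, and `alpha` coincides with `f(x)`, which is exactly the uniqueness statement of the proof.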

2 Definition of Distributions

For an open nonempty subset \(\Omega\subset \mathbb{R}^n\), we have introduced the test function spaces \(\mathcal{D}(\Omega)\), \({\mathcal S}(\Omega)\), and \(\mathcal{E}(\Omega)\) as HLCTVSs. Furthermore the relation

$$\mathcal{D}(\Omega) \subset{\mathcal S}(\Omega) \subset \mathcal{E}(\Omega)$$

with continuous embeddings in both cases has been established (see Theorem 2.8). This section gives the definitions of the three basic classes of distributions as elements of the topological dual spaces of these test function spaces. Elements of the topological dual \(\mathcal{D}^{\prime}(\Omega)\) of \(\mathcal{D}(\Omega)\) are called distributions on Ω. Elements of the topological dual \({\mathcal S}^{\prime}(\Omega)\) of \({\mathcal S}(\Omega)\) are called tempered distributions and elements of the topological dual \(\mathcal{E}^{\prime}(\Omega)\) of \(\mathcal{E}(\Omega)\) are called distributions of compact support. Later, after further preparation, the reasons for the names of the latter two classes of distributions will become apparent. The continuous embeddings mentioned above imply the following relation between these three classes of distributions, which justifies calling elements in \({\mathcal S}^{\prime}(\Omega)\), respectively in \(\mathcal{E}^{\prime}(\Omega)\), distributions:

$$\mathcal{E}^{\prime}(\Omega)\subset{\mathcal S}^{\prime}(\Omega) \subset \mathcal{D}^{\prime}(\Omega).$$
(3.2)

We proceed with a more explicit discussion of distributions.

Definition 3.1

A distribution T on an open nonempty subset \(\Omega \subset \mathbb{R}^n\) is a continuous linear functional on the test function space \(\mathcal{D}(\Omega)\) of \(\mathcal{C}^{\infty}\)-functions of compact support. The set of all distributions on Ω equals the topological dual \(\mathcal{D}^{\prime}(\Omega)\) of \(\mathcal{D}(\Omega)\).

Another way to define a distribution on a nonempty open subset \(\Omega \subset \mathbb{R}^n\) is to recall Proposition 2.6 and to define: A linear functional T on \(\mathcal{D}(\Omega)\) is a distribution on Ω if, and only if, its restriction to the spaces \(\mathcal{D}_K(\Omega)\) is continuous for every compact subset \(K\subset \Omega\). Taking Theorem 2.4 into account one arrives at the following characterization of distributions.

Theorem 3.1

A linear functional \(T:\mathcal{D}(\Omega) \to \mathbb K\) is a distribution on the open nonempty set \(\Omega\subset \mathbb{R}^n\) if, and only if, for every compact subset \(K\subset \Omega\) there exist a number \(C\in{\mathbb R}^{+}\) and a natural number \(m\in\mathbb N\), both depending in general on K and T, such that for all \(\phi \in \mathcal{D}_K(\Omega)\) the estimate

$$|T(\phi)|\leq C p_{K,m}(\phi)$$
(3.3)

holds.

An equivalent way to express this is the following:

Corollary 3.1

A linear function \(T: \,\mathcal{D}(\Omega) \to \mathbb K\) is a distribution on Ω if, and only if, for every compact subset \(K \subset \Omega\) there is an integer m such that

$$p^{\prime}_{K,m}(T) = \sup\left\{|T(\phi)|: \phi \in \mathcal{D}_K(\Omega),\;p_{K,m}(\phi)\leq 1\right\}$$
(3.4)

is finite and then

$$|T(\phi)|\leq p^{\prime}_{K,m}(T) p_{K,m}(\phi) \qquad \forall\, \phi \in \mathcal{D}_K(\Omega).$$

The proof of the corollary is left as an exercise. This characterization leads to the important concept of the order of a distribution.

Definition 3.2

Let T be a distribution on \(\Omega \subset \mathbb{R}^n\), Ω open and nonempty, and let \(K\subset \Omega\) be a compact subset. Then the local order \(O(T,K)\) of \(T\) on \(K\) is defined as the minimum of all natural numbers m for which (3.3) holds. The order \(O(T)\) of T is the supremum over all local orders.

In terms of the concept of order, Theorem 3.1 says: Locally every distribution is of finite order, i.e. a finite number of derivatives of the test functions φ are used in the estimate (3.3) (recall the definition of the seminorms \(p_{K,m}\) in Eq. (2.1)).

Remark 3.1

  1. 1.

    As the topological dual of the HLCTVS \(\mathcal{D}(\Omega)\), the set of all distributions on an open set \(\Omega \subset \mathbb{R}^n\) forms naturally a vector space over the field \(\mathbb K\). Addition and scalar multiplication are explicitly given as follows: For all \(T, T_i \in \mathcal{D}^{\prime}(\Omega)\) and all \(\lambda \in \mathbb K\),

    $$\forall_{\phi \in \mathcal{D}(\Omega)} (T_1+T_2)(\phi)=T_1(\phi) + T_2(\phi), \qquad (\lambda T)(\phi)=\lambda T(\phi).$$

    Thus, \((T,\phi) \mapsto T(\phi)\) is a bilinear function \(\mathcal{D}^{\prime} \times \mathcal{D} \to \mathbb K\).

  2. 2.

    According to their definition, distributions assign a real or complex number \(T(\phi)\) to each test function \(\phi\in \mathcal{D}(\Omega)\). A frequently used alternative notation for the value \(T(\phi)\) of the functional T is

    $$T(\phi)=\langle T,\phi \rangle= \langle T(x),\phi(x) \rangle.$$
  3. 3.

    In physics textbooks one often finds the notation \(\int_{\Omega} T(x)\phi(x) {\textrm d} x\) for the value \(T(\phi)\) of the distribution T at the test function φ. This suggestive notation is rather formal since when one wants to make sense out of this expression the integral sign used has little to do with the standard integrals (further details are provided in the section on representation of distributions as “generalized” derivatives of continuous functions).

  4. 4.

    The axiom of choice allows us to show that there are linear functionals on \(\mathcal{D}_K(\Omega)\) which are not continuous. But nobody has succeeded in giving an explicit example of such a noncontinuous functional. Thus, in practice one does not encounter these exceptional functionals.

  5. 5.

    One may wonder why we spoke about \(\mathcal{D}(\Omega)\) as the test function space of distribution theory. Naturally, \(\mathcal{D}(\Omega)\) is not given a priori. One has to make a choice. The use of \(\mathcal{D}(\Omega)\) is justified a posteriori by many successful applications. Nevertheless, there are some guiding principles for the choice of test function spaces (compare the introductory remarks on the goals of distribution theory).

    1. a)

      The choice of test function spaces as subspaces of the space of \(\mathcal{C}^{\infty}\)-functions on which all derivative monomials \(D^{\alpha}\) act linearly and continuously ensures that all distributions will be infinitely often differentiable too.

    2. b)

      Further restrictions on the subspace of \(\mathcal{C}^{\infty}\)-functions as a test function space depend on the intended use of the resulting space of generalized functions. For instance, the choice of \(\mathcal{C}^{\infty}\)-functions on Ω with compact support ensures that the resulting distributions on Ω are not restricted in their behavior at the boundary of the set Ω. Later we will see that the test function space of \(\mathcal{C}^{\infty}\)-functions which are strongly decreasing ensures that the resulting space of generalized functions admits the Fourier transformation as an isomorphism, which has many important consequences.

A number of examples will help to explain how the above definition operates in concrete cases. The first class of examples shows furthermore how distributions generalize functions, so that it is appropriate to speak about distributions as special classes of generalized functions. Later, we will give an overview of some other classes of generalized functions.

2.1 The Regular Distributions

Suppose that \(f:\Omega \to \mathbb K\) is a continuous function on the open nonempty set \(\Omega\subset \mathbb{R}^n\). Then, for every compact subset \(K\subset \Omega\) the (Riemann) integral \(\int_K |f(x)|{\textrm d} x=C\) is known to exist. Hence, for all \(\phi \in \mathcal{D}_K(\Omega)\) one has

$$\left|\int_K f(x)\phi(x){\textrm d} x\right|\leq \int_K |f(x)\phi(x)|{\textrm d} x \leq \sup_{x\in K} |\phi(x)|\int_K |f(x)|{\textrm d} x.$$

It follows that \(I_f:\mathcal{D}(\Omega) \to \mathbb K\) is well defined by

$$\langle I_f,\phi \rangle = \int f(x) \phi(x){\textrm d} x \qquad \forall\,\phi \in \mathcal{D}(\Omega)$$

and that for all \(\phi \in \mathcal{D}_K(\Omega)\) one has the estimate

$$|\langle I_f,\phi\rangle|\leq C p_{K,0}(\phi).$$

Elementary properties of the Riemann integral imply that \(I_f\) is a linear functional on \(\mathcal{D}(\Omega)\). Since we could establish the estimate (3.3) in Theorem 3.1, it follows that \(I_f\) is continuous and thus a distribution on Ω. In addition this estimate shows that the local order and the order of the distribution \(I_f\) are 0.

Obviously these considerations apply to any \(f \in \mathcal{C}(\Omega)\). Therefore, \(f\mapsto I_f\) defines a map \(I: \mathcal{C}(\Omega) \to \mathcal{D}^{\prime}(\Omega)\) which is easily seen to be linear. In the Exercises it is shown that I is injective and thus provides an embedding of the space of all continuous functions into the space of distributions.
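The order-zero estimate \(|\langle I_f,\phi\rangle|\leq C p_{K,0}(\phi)\) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes \(\Omega=\mathbb{R}\), \(K=[-1,1]\), the standard bump function as test function, and \(f(x)=\cos(3x)\) as a sample continuous function; the helper names `bump`, `midpoint`, and `f` are hypothetical, and the integrals are approximated by a composite midpoint rule rather than computed exactly.

```python
import math


def bump(x):
    """C-infinity bump function with support K = [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def midpoint(g, lo, hi, n=20000):
    """Composite midpoint rule for the integral of g over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(g(lo + (k + 0.5) * step) for k in range(n))


def f(x):
    """Sample continuous function; any continuous f would do."""
    return math.cos(3.0 * x)


# <I_f, phi> for phi = bump, the constant C = integral of |f| over K,
# and p_{K,0}(phi) = sup |phi|, attained by the bump at x = 0.
pairing = midpoint(lambda x: f(x) * bump(x), -1.0, 1.0)
C = midpoint(lambda x: abs(f(x)), -1.0, 1.0)
sup_phi = bump(0.0)

print("pairing:", pairing)
print("bound C * p_{K,0}(phi):", C * sup_phi)
print("estimate holds:", abs(pairing) <= C * sup_phi)
```

No derivative of the test function enters the bound, which is the numerical counterpart of the statement that \(I_f\) has order 0.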

Note that the decisive property we used for the embedding of continuous functions into the space of distributions was that, for \(f \in \mathcal{C}(\Omega)\) and every compact subset \(K\subset \Omega\), the Riemann integral \(C=\int_K|f(x)|{\textrm d} x\) is finite. Therefore, the same ideas allow us to consider a much larger space of functions on Ω as distributions, namely the space \(L^1_{loc}(\Omega)\) of all locally integrable functions on Ω. \(L^1_{loc}(\Omega)\) is the space of all (equivalence classes of) Lebesgue measurable functions on Ω for which the Lebesgue integral

$$\left\|f\right\|_{1,K} =\int_K |f(x)|{\textrm d} x$$
(3.5)

is finite for every compact subset \(K \subset \Omega\). Thus, the map I can be extended to a map \(I:L^1_{loc}(\Omega) \to \mathcal{D}^{\prime}(\Omega)\) by the same formula: For every \(f \in L^1_{loc}(\Omega)\) define \(I_f:\mathcal{D}(\Omega)\to \mathbb K\) by

$$I_f(\phi)=\int f(x)\phi(x){\textrm d} x \qquad \forall \,\phi \in \mathcal{D}(\Omega).$$

The bound \(|I_f(\phi)|\leq \left\|f\right\|_{1,K}\, p_{K,0}(\phi)\) for all \(\phi \in \mathcal{D}_K(\Omega)\) proves as above that \(I_f \in \mathcal{D}^{\prime}(\Omega)\) for all \(f\in L^1_{loc}(\Omega)\). A simple argument implies that I is a linear map and in the Exercises we prove that I is injective, i.e., \(I_f=0\) in \(\mathcal{D}^{\prime}(\Omega)\) if, and only if, f = 0 in \(L^1_{loc}(\Omega)\). Therefore, I is an embedding of \(L^1_{loc}(\Omega)\) into \(\mathcal{D}^{\prime}(\Omega)\). The space \(L^1_{loc}(\Omega)\) is an HLCTVS when it is equipped with the filtering system of seminorms \(\left\{\left\|\cdot\right\|_{1,K}: K\subset \Omega,\;{\rm compact}\right\}\). With respect to this topology, the embedding I is continuous in the following sense. If \((f_j)_{j\in \mathbb N}\) is a sequence which converges to zero in \(L^1_{loc}(\Omega)\), then, for every \(\phi \in \mathcal{D}(\Omega)\), one has \(\lim_{j\to \infty}{I_{f_j}}(\phi)=0\), which follows easily from the bound given above. We summarize our discussion as the so-called embedding theorem.

Theorem 3.2

The space \(L^1_{loc}(\Omega)\) of locally integrable functions on an open nonempty set \(\Omega\subset \mathbb{R}^n\) is embedded into the space \(\mathcal{D}^{\prime}(\Omega)\) of distributions on Ω by the linear and continuous injection I. The image of \(L^1_{loc}(\Omega)\) under I is called the space of regular distributions on Ω:

$$\mathcal{D}^{\prime}_{\rm reg}(\Omega)=I(L^1_{loc}(\Omega))\subset \mathcal{D}^{\prime}(\Omega).$$
(3.6)

Note that under the identification of f and \(I_f\) we have established the following chain of relations:

$$\mathcal{C}(\Omega)\subset L^r_{loc}(\Omega)\subseteq L^1_{loc}(\Omega) \subset \mathcal{D}^{\prime}(\Omega)$$

for any \(r\geq 1\), since for \(r>1\) the space of measurable functions f on Ω for which \(|f|^r\) is locally integrable is known to be contained in \(L^1_{loc}(\Omega)\).

2.2 Some Standard Examples of Distributions

2.2.1 Dirac’s Delta Distribution

For any point \(a \in \Omega \subset \mathbb{R}^n\) define a functional \(\delta_a:\mathcal{D}(\Omega) \to \mathbb K\) by

$$\delta_a(\phi)=\phi(a) \qquad \forall\; \phi\in \mathcal{D}(\Omega).$$

Obviously \(\delta_a\) is linear. For any compact subset \(K\subset \Omega\), one has the following estimate:

$$|\delta_a(\phi)|\leq C(a,K) p_{K,0}(\phi) \qquad \forall\; \phi \in \mathcal{D}_K(\Omega)$$

where the constant \(C(a,K)\) equals 1 if \(a \in K\) and \(C(a,K)=0\) otherwise. Therefore, the linear functional \(\delta_a\) is continuous on \(\mathcal{D}(\Omega)\) and thus a distribution. Its order obviously is zero. In the Exercises it is shown that \(\delta_a\) is not a regular distribution, i.e., there is no \(f \in L^1_{loc}(\Omega)\) such that \(\delta_a(\phi)=\int f(x) \phi(x){\textrm d} x\) for all \(\phi\in \mathcal{D}(\Omega)\).
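As a small illustration, the functional \(\delta_a\) and the estimate with the constant \(C(a,K)\) can be written down directly. The following sketch assumes \(\Omega=\mathbb{R}\) and \(K=[-1,1]\); the names `bump`, `delta`, `sup_norm`, and the two sample test functions are hypothetical choices, and the supremum \(p_{K,0}(\phi)\) is approximated on a finite grid that contains the point \(a=0.5\).

```python
import math


def bump(x):
    """C-infinity bump function with support K = [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def delta(a):
    """Dirac functional at a: maps a test function phi to phi(a)."""
    return lambda phi: phi(a)


def sup_norm(phi, lo=-1.0, hi=1.0, n=2001):
    """Grid approximation of p_{K,0}(phi), the sup of |phi| over K = [lo, hi]."""
    return max(abs(phi(lo + k * (hi - lo) / (n - 1))) for k in range(n))


phi1 = bump
phi2 = lambda x: (1.0 + x) * bump(x)
combo = lambda x: phi1(x) + 2.0 * phi2(x)

d_half = delta(0.5)  # a = 0.5 lies in K and on the grid used by sup_norm
d_far = delta(3.0)   # a = 3.0 lies outside K

print("linear:", d_half(combo) == d_half(phi1) + 2.0 * d_half(phi2))
print("a in K, C(a,K) = 1:", abs(d_half(phi2)) <= sup_norm(phi2))
print("a not in K, C(a,K) = 0:", d_far(phi2) == 0.0)
```

The last line reflects that a test function supported in K vanishes at every point outside K, which is why the constant can be taken to be 0 in that case.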

2.2.2 Cauchy’s Principal Value

It is easy to see that \(x\mapsto \frac{1}{x}\) is not a locally integrable function on the real line \(\mathbb{R}\), hence \(I_{\frac{1}{x}}\) does not define a regular distribution. Nevertheless, one can define a distribution on \(\mathbb{R}\) which agrees with \(I_{\frac{1}{x}}\) on \(\mathbb{R}\backslash \left\{0\right\}\). This distribution is called Cauchy’s principal value and is defined by

$$\langle{\rm vp}\frac{1}{x}, \phi \rangle =\lim_{r\to 0}\int_{|x|\geq r} \frac{\phi(x)}{x} {\textrm d} x.$$
(3.7)

We have to show that this limit exists and that it defines a continuous linear functional on \(\mathcal{D}(\mathbb{R})\). For \(a>0\) consider the compact interval \(K=[\!-a,a]\). Take \(0<r <a\) and calculate, for all \(\phi \in \mathcal{D}_K(\mathbb{R})\),

$$\int_{|x|\geq r}\frac{\phi(x)}{x}{\textrm d} x =\int_r^a \frac{\phi(x)-\phi({-}x)}{x} {\textrm d} x.$$

If we observe that \(\phi(x)-\phi({-}x)=x\int_{-1}^{+1} \phi^{\prime}(xt){\textrm d} t\), we get the estimate

$$\left|\frac{\phi(x)-\phi({-}x)}{x}\right|\leq 2 \sup_{y \in K}|\phi^{\prime}(y)|\leq 2 p_{K,1}(\phi),$$

and thus \(\left|\int_r^a \frac{\phi(x)-\phi({-}x)}{x} {\textrm d} x\right|\leq 2a\, p_{K,1}(\phi)\) uniformly in \(0<r <a\), for all \(\phi \in \mathcal{D}_K(\mathbb{R})\). Since the integrand is bounded on \((0,a]\), the limit \(r\to 0\) exists and has the value

$$\lim_{r \to 0}\int_{|x|\geq r} \frac{\phi(x)}{x}{\textrm d} x=\int_0^{\infty} \frac{\phi(x)-\phi({-}x)}{x} {\textrm d} x.$$

Furthermore, the continuity bound

$$|\langle{\rm vp}\frac{1}{x},\phi \rangle|\leq |K|p_{K,1}(\phi)$$

for all \(\phi \in \mathcal{D}_K(\mathbb{R})\) follows. Therefore, \({\rm vp}\frac{1}{x}\) is a well-defined distribution on \(\mathbb{R}\) according to Theorem 3.1. Its order obviously is 1.

The above proof gives the following convenient formula for Cauchy’s principal value:

$$\langle{\rm vp}\frac{1}{x}, \phi\rangle =\int_{0}^{\infty} \frac{\phi(x)-\phi({-}x)}{x}{\textrm d} x.$$
(3.8)

Test functions in \(\mathcal{D}(\mathbb{R}\backslash \left\{0\right\})\) have the property that they vanish in some neighborhood of the origin (depending on the function). Hence, for these test functions the singular point x = 0 of \(\frac{1}{x}\) is avoided, and thus it follows that

$$\lim_{r\to 0}\int_{|x|\geq r}\frac{\phi(x)}{x}{\textrm d} x=\int_{\mathbb{R}}\frac{\phi(x)}{x}{\textrm d} x = \langle I_{\frac{1}{x}},\phi\rangle \qquad \forall \phi \in \mathcal{D}(\mathbb{R} \backslash\left\{0\right\}).$$

Sometimes one also finds the notation \({\rm vp}\int_{\mathbb{R}}\frac{\phi(x)}{x}{\textrm d} x\) for \(\langle{\rm vp}\frac{1}{x}, \phi\rangle\). The letters “vp” in the notation for Cauchy’s principal value stand for the original French name “valeur principale.”
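The agreement between the defining limit (3.7) and formula (3.8) can be observed numerically. The sketch below is an illustration under assumptions that are not part of the text: the test function is \(\phi(x)=(1+x)\,b(x)\) with the standard bump \(b\) supported in \([-1,1]\), the integrals are approximated by a composite midpoint rule, and the names `bump`, `phi`, `midpoint`, `truncated`, and `vp_formula` are hypothetical.

```python
import math


def bump(x):
    """C-infinity bump function with support in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def phi(x):
    """Test function with support in [-1, 1] that is neither even nor odd."""
    return (1.0 + x) * bump(x)


def midpoint(g, lo, hi, n=20000):
    """Composite midpoint rule for the integral of g over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(g(lo + (k + 0.5) * step) for k in range(n))


def truncated(r):
    """Integral of phi(x)/x over r <= |x| <= 1, as in the limit (3.7)."""
    g = lambda x: phi(x) / x
    return midpoint(g, -1.0, -r) + midpoint(g, r, 1.0)


def vp_formula():
    """Right-hand side of (3.8); phi vanishes for |x| >= 1."""
    return midpoint(lambda x: (phi(x) - phi(-x)) / x, 0.0, 1.0)


for r in (1e-1, 1e-2, 1e-3):
    print("r =", r, " truncated:", truncated(r), " formula (3.8):", vp_formula())
```

For this particular \(\phi\) the integrand of (3.8) equals \(2b(x)\), so the symmetric difference quotient removes the singularity at the origin and the truncated integrals approach the value of (3.8) as \(r\) decreases.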

2.2.3 Hadamard’s Principal Values

Closely related to Cauchy’s principal value is a family of distributions on \(\mathbb{R}\) which can be traced back to Hadamard. Certainly, for \(1<\beta<2\) the function \(\frac{1}{x^{\beta}}\) is not locally integrable on \({\mathbb R}^{+}\). We are going to define a distribution T on \({\mathbb R}^{+}\) which agrees on \({\mathbb R}^{+}\backslash\left\{0\right\}=(0,\infty)\) with the regular distribution \(I_{x^{-\beta}}\). For all \(\phi \in \mathcal{D}({\mathbb{R}})\) define

$$\langle T,\phi\rangle= \int_0^{\infty} \frac{\phi(x)-\phi(0)}{x^{\beta}}{\textrm d} x.$$

Since again \(\phi(x)-\phi(0)=x\int_0^1 \phi^{\prime}(xt){\textrm d} t\) we can estimate

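$$\left|\frac{\phi(x)-\phi(0)}{x^{\beta}}\right|\leq |x|^{1-\beta} p_{K,1}(\phi)$$

if \(\phi \in \mathcal{D}_K(\mathbb{R})\). Since the exponent \(\gamma=1-\beta\) is larger than −1, the integral exists near x = 0. For all x outside K one has \(\phi(x)=0\), so there the integrand equals \(-\phi(0)x^{-\beta}\), which is integrable at infinity because \(\beta>1\). Hence, T is well defined on \(\mathcal{D}(\mathbb{R})\). Elementary properties of integrals imply that T is linear, and the two estimates together imply, as in the previous example, a continuity bound of the form \(|\langle T,\phi\rangle|\leq C p_{K,1}(\phi)\) for all \(\phi \in \mathcal{D}_K(\mathbb{R})\). Therefore, T is a distribution on \(\mathbb{R}\).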

If \(\phi \in \mathcal{D}(\mathbb{R} \backslash \left\{0\right\})\), then in particular \(\phi(x)=0\) for all \(x \in \mathbb{R}\), \(|x|\leq r\) for some \(r>0\), and we get \(\langle T, \phi \rangle= \int_0^{\infty} \frac{\phi(x)}{x^{\beta}} {\textrm d} x =I_{\frac{1}{x^{\beta}}}(\phi)\). Hence, on \(\mathbb{R}\backslash \left\{0\right\}\) the distribution T is regular.

Distributions like Cauchy’s and Hadamard’s principal values are also called pseudo functions, since away from the origin x = 0 they coincide with the corresponding regular distributions. Thus, we can consider the pseudo functions as extensions of the regular distributions to the point x = 0.

3 Convergence of Sequences and Series of Distributions

Often the need arises to approximate given distributions by “simpler” distributions, for instance functions. For this one obviously needs a topology on the space \(\mathcal{D}^{\prime}(\Omega)\) of all distributions on a nonempty open set \(\Omega \subset \mathbb{R}^n\). A topology which suffices for our purposes is the so-called weak topology which is defined on \(\mathcal{D}^{\prime}(\Omega)\) by the system of seminorms \(\mathcal{P}_{\sigma}={\left\{\rho_{\phi}: \phi \in \mathcal{D}(\Omega)\right\}}\). Here \(\rho_{\phi}\) is defined by

$$\rho_{\phi}(T)=|\langle T,\phi\rangle|=|T(\phi)|\qquad \textrm{for all}\,T \in \mathcal{D}^{\prime}(\Omega).$$

This topology is usually denoted by \(\sigma\equiv \sigma(\mathcal{D}^{\prime},\mathcal{D})\).

If not stated explicitly otherwise we consider \(\mathcal{D}^{\prime}(\Omega)\) always equipped with this topology σ. Then, from our earlier discussions on HLCTVS, we know in principle what convergence in \(\mathcal{D}^{\prime}\) means or what a Cauchy sequence of distributions is. For clarity we write down these definitions explicitly.

Definition 3.3

Let \(\Omega \subset \mathbb{R}^n\) be open and nonempty and let \((T_j)_{j\in \mathbb N}\) be a sequence of distributions on Ω, i.e., a sequence in \(\mathcal{D}^{\prime}(\Omega)\). One says:

  1. 1.

    \((T_j)_{j\in \mathbb N}\) converges in \(\mathcal{D}^{\prime}(\Omega)\) if, and only if, there is a \(T \in \mathcal{D}^{\prime}(\Omega)\) such that for every \(\phi \in \mathcal{D}(\Omega)\) the numerical sequence \((T_j(\phi))_{j\in \mathbb N}\) converges in \(\mathbb K\) to \(T(\phi)\).

  2. 2.

    \((T_j)_{j\in \mathbb N}\) is a Cauchy sequence in \(\mathcal{D}^{\prime}(\Omega)\) if, and only if, for every \(\phi \in \mathcal{D}(\Omega)\) the numerical sequence \((T_j(\phi))_{j\in \mathbb N}\) is a Cauchy sequence in \(\mathbb K\).

Several simple examples will illustrate these definitions and how these concepts are applied to concrete problems. All sequences we consider here are sequences of regular distributions defined by sequences of functions which have no limit in the sense of functions.

Example 3.1

  1. 1.

    The sequence of \(\mathcal{C}^{\infty}\)-functions \(f_j(x)=\sin{jx}\) on \(\mathbb{R}\) certainly has no limit in the sense of functions. We claim that the sequence of regular distributions \(T_j=I_{f_j}\) defined by these functions converges in \(\mathcal{D}^{\prime}(\mathbb{R})\) to zero. For the proof take any \(\phi \in \mathcal{D}(\mathbb{R})\). A partial integration shows that

    $$\langle T_j,\phi \rangle=\int \sin{(jx)} \phi(x){\textrm d} x= \frac{1}{j}\int \cos{(jx)} \phi^{\prime}(x){\textrm d} x$$

    and we conclude that \(\lim_{j\to \infty}\langle T_j,\phi\rangle=0\).

  2. 2.

    Delta sequences: δ-sequences are sequences of functions which converge in \(\mathcal{D}^{\prime}\) to Dirac’s delta distribution. We present three examples of such sequences.

    1. a)

      Consider the sequence of continuous functions \(t_j(x)=\frac{\sin{(jx)}}{x}\) and denote \(T_j={I_{t_j}}\). Then

      $$\lim_{j\to \infty} T_j = \pi \delta\qquad{\rm in}\;\mathcal{D}^{\prime}(\mathbb{R}).$$

      For the proof take any \(\phi \in \mathcal{D}(\mathbb{R})\). Then the support of φ is contained in \([-a,a]\) for some \(a>0\). It follows that

      $$\begin{array}{ll} \langle T_j,\phi\rangle &= \int_{-a}^{+a} \frac{\sin{(jx)}}{x} \phi(x) {\textrm d} x\\ &=\int_{-a}^{+a} \frac{\sin{(jx)}}{x} [\phi(x) -\phi(0)]{\textrm d} x + \int_{-a}^{+a} \frac{\sin{(jx)}}{x} \phi(0){\textrm d} x.\end{array}$$

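      As in the first example, a partial integration, together with \(\phi(\pm a)=0\), shows that

      $$\int_{-a}^{+a} \frac{\sin{(jx)}}{x}[ \phi(x)-\phi(0)]{\textrm d} x= \frac{2\phi(0)\cos{(ja)}}{ja}+\frac{1}{j} \int_{-a}^{+a} \cos{(jx)} \frac{\textrm d}{{\textrm d} x}\left(\frac{\phi(x)-\phi(0)}{x}\right) {\textrm d} x$$

      converges to zero for \(j\to \infty\); the first term on the right is the boundary contribution at \(x=\pm a\), where \(\frac{\phi(x)-\phi(0)}{x}\) does not vanish in general. Then recall the integral: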

      $$\int_{-a}^{+a} \frac{\sin{(jx)}}{x} {\textrm d} x = \int_{-ja}^{+ja} \frac{\sin{y}}{y}{\textrm d} y \to_{j \to \infty} \int_{-\infty}^{+\infty} \frac{\sin{y}}{y} {\textrm d} y =\pi.$$

      We conclude that \(\lim_{j\to \infty}\langle T_j,\phi \rangle= \pi \phi(0)\) for every \(\phi \in \mathcal{D}(\mathbb{R})\) which proves the statement.

    2. b)

      Take any nonnegative function \(f \in L^1(\mathbb{R}^n)\) with \(\int_{\mathbb{R}^n}f(x){\textrm d} x =1\). Introduce the sequence of functions \(f_j(x)=j^n f(jx)\) and the associated sequence of regular distributions \(T_j=I_{f_j}\). We claim:

      $$\lim_{j\to \infty} T_j = \delta \qquad{\rm in}\;\mathcal{D}^{\prime}(\mathbb{R}^n).$$

      The proof is simple. Take any \(\phi \in \mathcal{D}(\mathbb{R}^n)\) and calculate as above,

      $$\begin{array}{ll} \langle T_j,\phi\rangle &= \int_{\mathbb{R}^n} f_j(x)\phi(x) {\textrm d} x\\ &=\int_{\mathbb{R}^n} f_j(x)[\phi(x) - \phi(0)] {\textrm d} x + \int_{\mathbb{R}^n} f_j(x)\phi(0){\textrm d} x.\end{array}$$

      To the first term

      $$\begin{aligned} \int_{\mathbb{R}^n} f_j(x)[\phi(x) - \phi(0)] {\textrm d} x&=\int_{\mathbb{R}^n} j^n f(jx)[\phi(x) - \phi(0)] {\textrm d} x\\ &=\int_{\mathbb{R}^n} f(y)[\phi(\frac{y}{j}) - \phi(0)]{\textrm d} y\end{aligned}$$

      we apply Lebesgue’s dominated convergence theorem: the integrand converges pointwise to zero by the continuity of φ and is dominated by the integrable function \(2\sup_{x}|\phi(x)|\, f(y)\). Hence the limit \(j \to \infty\) of this term vanishes. For the second term note that \(\int_{\mathbb{R}^n}f_j(x){\textrm d} x =\int_{\mathbb{R}^n} f(y){\textrm d} y=1\) for all \(j \in \mathbb N\) and we conclude.

      As a special case of this result we mention that we can take in particular \(f\in \mathcal{D}(\mathbb{R}^n)\). This then shows that Dirac’s delta distribution is the limit in \(\mathcal{D}^{\prime}\) of a sequence of \(\mathcal{C}^{\infty}\)-functions of compact support.

    3. c)

      For the last example of a delta sequence we start with the Gauss function on \(\mathbb{R}^n\): \(g(x)=(\pi)^{-\frac{n}{2}}\textrm{e}^{-x^2}\). Certainly \(0\leq g \in L^1(\mathbb{R}^n)\) and thus we can proceed as in the previous example. The sequence of scaled Gauss functions \(g_j(x)=j^n g(jx)\) converges in the sense of distributions to Dirac’s delta distribution, i.e., for every \(\phi \in \mathcal{D}(\mathbb{R}^n)\):

      $$\lim _{j\to \infty}\langle I_{g_j},\phi \rangle =\phi(0)=\langle \delta, \phi\rangle.$$

      This example shows that Dirac’s delta can also be approximated by a sequence of strongly decreasing \(\mathcal{C}^{\infty}\)-functions.

  3. 3.

    Now we prove the Breit–Wigner formula. For each \(\varepsilon>0\) define a function \(f_{\varepsilon}:\mathbb{R} \to \mathbb{R}\) by

    $$f_{\varepsilon}(x)=\frac{\varepsilon}{x^2 +\varepsilon^2}= {\rm Im}\frac{1}{x-\textrm{i} \varepsilon}= \frac{\textrm{i}}{2}\left[\frac{1}{x+\textrm{i} \varepsilon} -\frac{1}{x-\textrm{i} \varepsilon}\right].$$

    We claim that

    $$\lim _{\varepsilon \to 0} I_{f_{\varepsilon}}=\pi \delta\qquad{\rm in}\;\;\mathcal{D}^{\prime}(\mathbb{R}).$$
    (3.9)

    Often this is written as

    $$\lim _{\varepsilon \to 0} \frac{\varepsilon}{x^2 + \varepsilon^2} = \pi \delta$$

    (Breit–Wigner formula).

    This is actually a special case of a delta sequence: The function \(h(x)= \frac{1}{1+x^2}\) satisfies \(0\leq h \in L^1(\mathbb{R})\) and \(\int_{\mathbb{R}} h(x){\textrm d} x=\pi\). Thus, one can take \(h_j(x)=jh(jx)=f_{\varepsilon}(x)\) for \(\varepsilon=\frac{1}{j}\) and apply the second result on delta sequences to the normalized function \(\frac{1}{\pi}h\).

  4. 4.

    Closely related to the Breit–Wigner formula is the Sokhotski–Plemelj formula. It reads

    $$\lim _{\varepsilon \to 0}\frac{1}{x\pm \textrm{i} \varepsilon} = \mp \textrm{i} \pi \delta + {\rm vp}\,\frac{1}{x} \qquad{\rm in}\;\;\mathcal{D}^{\prime}(\mathbb{R}).$$
    (3.10)

    Both formulas are used quite often in quantum mechanics.

    For any \(\varepsilon>0\) we have

    $$\frac{1}{x\pm\textrm{i} \varepsilon}={\rm Re} \frac{1}{x\pm \textrm{i} \varepsilon} +\textrm{i} \,{\rm Im}\frac{1}{x\pm\textrm{i} \varepsilon}$$

    where

    $$\begin{aligned} {\rm Re}\frac{1}{x\pm \textrm{i} \varepsilon}&=\frac{x}{x^2 +\varepsilon^2} \equiv g_{\varepsilon}(x),\\ {\rm Im}\frac{1}{x\pm\textrm{i} \varepsilon}&=\mp \frac{\varepsilon}{x^2+\varepsilon^2} \equiv \mp f_{\varepsilon}(x).\end{aligned}$$

    The limit of \(f_{\varepsilon}\) for \(\varepsilon \to 0\) has been determined for the Breit–Wigner formula. To find the corresponding limit for the functions \(g_{\varepsilon}\) note first that \(g_{\varepsilon}\) is not integrable on \(\mathbb{R}\). It is only locally integrable. Take any \(\phi \in \mathcal{D}(\mathbb{R})\) and observe that the functions \(g_{\varepsilon}\) are odd. Thus, we get

    $$\langle I_{g_{\varepsilon}},\phi\rangle=\int_{\mathbb{R}} g_{\varepsilon}(x)\phi(x){\textrm d} x=\int_0^{\infty} g_{\varepsilon}(x)[\phi(x)-\phi({-}x)]{\textrm d} x.$$

    Rewrite the integrand as

    $$g_{\varepsilon}(x)[\phi(x)-\phi({-}x)]=xg_{\varepsilon}(x) \frac{\phi(x)-\phi({-}x)}{x}$$

    and observe that the function \(\frac{\phi(x)-\phi({-}x)}{x}\) belongs to \(L^1(\mathbb{R})\) while the functions \(xg_{\varepsilon}(x)\) are bounded on \(\mathbb{R}\) by 1 and converge, for \(x \neq0\), pointwise to 1 as \(\varepsilon \to 0\). Lebesgue’s dominated convergence theorem thus implies that

    $$\lim _{\varepsilon \to 0} \int_{\mathbb{R}} g_{\varepsilon}(x)\phi(x){\textrm d} x=\int_0^{\infty} \frac{\phi(x)-\phi({-}x)}{x}{\textrm d} x,$$

    or

    $$\lim _{\varepsilon \to 0} \frac{x}{x^2 +\varepsilon^2}={\rm vp}\frac{1}{x} \qquad{\rm in}\;\;\mathcal{D}^{\prime}(\mathbb{R})$$
    (3.11)

    where we have taken Eq. (3.8) into account. Equation (3.11) and the Breit–Wigner formula together imply easily the Sokhotski–Plemelj formula.
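
The limits (3.9), (3.10), and (3.11) can also be observed numerically. The following Python sketch is only an illustration and not part of the argument. The test function \(\phi(x)=(1+x)\rho(x)\), built from the standard bump function ρ supported in \([-1,1]\), the midpoint rule, and all function names are assumptions made for this example.

```python
import math

def bump(x):
    """Standard bump function: exp(-1/(1-x^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def phi(x):
    """A non-symmetric test function, so that <vp 1/x, phi> is not zero."""
    return (1.0 + x) * bump(x)

def midpoint(func, a, b, n=100_000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def f_eps(x, eps):
    return eps / (x * x + eps * eps)

def g_eps(x, eps):
    return x / (x * x + eps * eps)

def pairing(kernel, eps):
    """<I_kernel, phi>; the support of phi lies in [-1, 1]."""
    return midpoint(lambda x: kernel(x, eps) * phi(x), -1.0, 1.0)

def principal_value():
    """<vp 1/x, phi> computed as the integral of (phi(x) - phi(-x)) / x over (0, 1)."""
    return midpoint(lambda x: (phi(x) - phi(-x)) / x, 0.0, 1.0)

def sokhotski_plemelj(eps, sign):
    """Integral of phi(x) / (x + sign * i * eps) over [-1, 1] (complex valued)."""
    return midpoint(lambda x: phi(x) / complex(x, sign * eps), -1.0, 1.0)
```

For a small value such as `eps = 1e-3`, one expects `pairing(f_eps, eps)` to be close to `math.pi * phi(0.0)` and `pairing(g_eps, eps)` to be close to `principal_value()`. Likewise, `sokhotski_plemelj(eps, +1)` should be close to `complex(principal_value(), -math.pi * phi(0.0))`. In each case the deviation should shrink as `eps` decreases.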

These concrete examples illustrate various practical aspects which have to be addressed in the proof of convergence of sequences of distributions. Now we formulate a fairly general and powerful result which simplifies the convergence proofs for sequences of distributions in an essential way: It says that for the convergence of a sequence of distributions, it suffices to show that this sequence is a Cauchy sequence, i.e., the space of distributions equipped with the weak topology is sequentially complete. Because of the great importance of this result we present a detailed proof.

Theorem 3.3

Equip the space of distributions \(\mathcal{D}^{\prime}(\Omega)\) on an open nonempty set \(\Omega \subset \mathbb{R}^n\) with the weak topology \(\sigma=\sigma(\mathcal{D}^{\prime}(\Omega), \mathcal{D}(\Omega))\) . Then \(\mathcal{D}^{\prime}(\Omega)\) is a sequentially complete HLCTVS.

In particular, for any sequence \((T_i)_{i\in\mathbb N}\subset \mathcal{D}^{\prime}(\Omega)\) such that for each \(\phi \in \mathcal{D}(\Omega)\) the numerical sequence \((T_i(\phi))_{i\in\mathbb N}\) converges, there are, for each compact subset \(K \subset \Omega\) , a constant C and an integer \(m \in\mathbb N\) such that

$$|T_i(\phi)|\leq C p_{K,m}(\phi)\qquad \forall\, \phi \in \mathcal{D}_K(\Omega),\;\forall i\in\mathbb N;$$
(3.12)

i.e., the sequence \((T_i)_{i\in\mathbb N}\) is equicontinuous on \(\mathcal{D}_K(\Omega)\) for each compact set \(K \subset \Omega\) .

Proof

Since its topology is defined in terms of a system of seminorms, the space of all distributions on Ω is certainly a locally convex topological vector space. Now given \(T \in \mathcal{D}^{\prime}(\Omega)\), \(T\neq 0\), there is a \(\phi \in \mathcal{D}(\Omega)\) such that \(T(\phi)\neq 0\), thus \(p_{\phi}(T)=|T(\phi)|>0\) and Proposition 2.2 implies that the weak topology is Hausdorff, hence \(\mathcal{D}^{\prime}(\Omega)\) is an HLCTVS.

In order to prove sequential completeness, we take any Cauchy sequence \((T_i)_{i\in \mathbb N}\) in \(\mathcal{D}^{\prime}(\Omega)\) and construct an element \(T \in \mathcal{D}^{\prime}(\Omega)\) to which this sequence converges.

For any \(\phi \in \mathcal{D}(\Omega)\) we know (by definition of a Cauchy sequence) \((T_i(\phi))_{i\in \mathbb N}\) to be a Cauchy sequence in the field \(\mathbb K\) which is complete. Hence, this Cauchy sequence of numbers converges to some number which we call \(T(\phi)\). Since this argument applies to any \(\phi \in \mathcal{D}(\Omega)\), we can define a function \(T:\mathcal{D}(\Omega) \to \mathbb K\) by

$$T(\phi)= \lim _{i \to \infty} T_i(\phi) \qquad \forall\, \phi \in \mathcal{D}(\Omega).$$

Since each \(T_i\) is linear, basic rules of calculation for limits of convergent sequences of numbers imply that the limit function T is linear too.

In order to show continuity of this linear functional T it suffices, according to Theorem, to show that \(T_K=T|\mathcal{D}_K(\Omega)\) is continuous on \(\mathcal{D}_K(\Omega)\) for every compact subset \(K\subset \Omega\). This is done by constructing a neighborhood U of zero in \(\mathcal{D}_K(\Omega)\) on which T is bounded and by using Corollary 2.1 to deduce continuity.

Since each \(T_i\) is continuous on \(\mathcal{D}_K(\Omega)\), we know that

$$U_i=\left\{\phi \in \mathcal{D}_K(\Omega):\, |T_i(\phi)|\leq 1\right\}$$

is a closed absolutely convex neighborhood of zero in \(\mathcal{D}_K(\Omega)\) (see also the Exercises). Now define

$$U=\cap_{i=1}^{\infty} U_i$$

and observe that U is a closed absolutely convex set on which the functional T is bounded by 1. Hence, in order to deduce continuity of T, one has to show that U is actually a neighborhood of zero in \(\mathcal{D}_K(\Omega)\). This part is indeed the core of the proof which relies on some fundamental properties of the space \(\mathcal{D}_K(\Omega)\) which are proven in the Appendix.

Take any \(\phi \in \mathcal{D}_K(\Omega)\); since the sequence \((T_i(\phi))_{i\in \mathbb N}\) converges, it is bounded and there is an \(n=n(\phi) \in \mathbb N\) such that \(|T_i(\phi)|\leq n\) for all \(i \in \mathbb N\). It follows that \(|T(\phi)|=\lim_{i\to \infty} |T_i(\phi)| \leq n\) and thus \(\phi =n\cdot \frac{1}{n}\phi \in nU\). Since φ was arbitrary in \(\mathcal{D}_K(\Omega)\), this proves

$$\mathcal{D}_K(\Omega)=\cup_{n=1}^{\infty} nU.$$

In Proposition 2.4 it is shown that \(\mathcal{D}_K(\Omega)\) is a complete metrizable HLCTVS. Hence the theorem of Baire (see Appendix, Theorem C.3) applies to this space, and it follows that one of the sets nU and hence U itself must have a nonempty interior. This means that some open ball \(B=\phi_0 + B_{p,r}\equiv\phi_0 +\left\{\phi \in \mathcal{D}_K(\Omega): p(\phi)<r\right\}\) is contained in the set U. Here \(\phi_0\) is some element in U, r some positive number and \(p=p_{K,m}\) is some continuous seminorm of the space \(\mathcal{D}_K(\Omega)\). Since T is bounded on U by 1 it is bounded on the neighborhood of zero \(B_{p,r}\) by \(1+|T(\phi_0)|\) and thus T is continuous.

All elements \(T_i\) of the sequence and the limit element T are bounded on this neighborhood U by 1. From the above it follows that there are a constant C and some integer \(m\in\mathbb N\) such that

$$|T_i(\phi)|\leq C p_{K,m}(\phi)\qquad \forall\, \phi \in \mathcal{D}_K(\Omega),\;\forall i\in\mathbb N;$$

i.e., the sequence \((T_i)_{i\in\mathbb N}\) is equicontinuous on \(\mathcal{D}_K(\Omega)\) for each compact set \(K \subset \Omega\), and we conclude. □

The convergence of a series of distributions is defined in the usual way through convergence of the corresponding sequence of partial sums. This can easily be translated into the following concrete formulation.

Definition 3.4

Given a sequence \((T_i)_{i\in \mathbb N}\) of distributions on a nonempty open set \(\Omega \subset \mathbb{R}^n\) one says that the series \(\sum_{i \in \mathbb N} T_i\) converges if, and only if, there is a \(T\in \mathcal{D}^{\prime}(\Omega)\) such that for every \(\phi \in \mathcal{D}(\Omega)\) the numerical series \(\sum_{i\in \mathbb N}T_i(\phi)\) converges to the number \(T(\phi)\).

As a first important application of Theorem 3.3, one has a rather convenient characterization of the convergence of a series of distributions.

Corollary 3.2

A series \(\sum_{i\in\mathbb N} T_i\) of distributions \(T_i \in \mathcal{D}^{\prime}(\Omega)\) converges if, and only if, for every \(\phi\in\mathcal{D}(\Omega)\) the numerical series \(\sum_{i\in\mathbb N}T_i(\phi)\) converges.

As a simple example consider the distributions \(T_i = c_i\delta_{ia}\) for some \(a>0\) and any sequence of numbers \(c_i\). Then the series

$$\sum_{i\in\mathbb N} c_i \delta_{ia}$$

converges in \(\mathcal{D}^{\prime}(\mathbb{R})\). The proof is simple. For every \(\phi\in \mathcal{D}(\mathbb{R})\) one has

$$\sum_{i\in\mathbb N}T_i(\phi)=\sum_{i\in\mathbb N}c_i \phi(ia)=\sum_{i=1}^m c_i \phi(ia)$$

for some \(m\in \mathbb N\) depending on the support of the test function φ (for \(i > m\) the point ia is not contained in \({\rm supp}\,\phi\)).
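
The truncation of the series can be made concrete in a few lines of Python. This is a minimal sketch under assumptions chosen here for illustration: the test function is a rescaled bump function, the caller supplies an upper bound `support_max` for its support, and the names `make_test_function` and `delta_series` are hypothetical.

```python
import math

def bump(x):
    """Standard bump function: exp(-1/(1-x^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def make_test_function(center, radius):
    """Test function supported in [center - radius, center + radius]."""
    return lambda x: bump((x - center) / radius)

def delta_series(phi, support_max, a, c):
    """Evaluate sum_i c(i) * phi(i * a) for phi with supp phi in (-inf, support_max].

    Returns the value of the series and the truncation index m.
    """
    # For i > m one has i * a > support_max, hence phi(i * a) = 0.
    m = math.floor(support_max / a)
    value = sum(c(i) * phi(i * a) for i in range(1, m + 1))
    return value, m
```

With `phi = make_test_function(2.0, 1.2)`, `a = 0.5` and the rapidly growing coefficients `c = lambda i: 2.0 ** i`, the call `delta_series(phi, 3.2, 0.5, c)` uses only the terms up to `m = 6`. Adding further terms does not change the value, however fast the coefficients grow.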

4 Localization of Distributions

Distributions on a nonempty open set \(\Omega \subset \mathbb{R}^n\) have been defined as continuous linear functionals on the test function space \(\mathcal{D}(\Omega)\) over Ω but not directly in points of Ω. Nevertheless we consider these distributions to be localized. In this section we explain in which sense this localization is understood.

Suppose \(\Omega_1 \subset \Omega_2 \subset \mathbb{R}^n\). Then every test function \(\phi \in \mathcal{D}(\Omega_1)\) vanishes in a neighborhood of the boundary of \(\Omega_1\) and thus can be continued by 0 to \(\Omega_2\) to give a compactly supported test function \(i_{\Omega_2,\Omega_1}(\phi)\) on \(\Omega_2\). This defines a mapping \(i_{\Omega_2,\Omega_1}: \mathcal{D}(\Omega_1) \to \mathcal{D}(\Omega_2)\) which is evidently linear and continuous. Thus, we can consider \(\mathcal{D}(\Omega_1)\) to be embedded into \(\mathcal{D}(\Omega_2)\) as \(i_{\Omega_2,\Omega_1}(\mathcal{D}(\Omega_1))\), i.e.

$$i_{\Omega_2,\Omega_1}(\mathcal{D}(\Omega_1))\subset \mathcal{D}(\Omega_2).$$

Hence, every continuous linear functional T on \(\mathcal{D}(\Omega_2)\) defines also a continuous linear functional \(T\circ i_{\Omega_2,\Omega_1}\equiv \rho_{\Omega_1,\Omega_2}(T)\) on \(\mathcal{D}(\Omega_1)\). Therefore, every distribution T on \(\Omega_2\) can be restricted to any open nonempty subset \(\Omega_1\) by

$$T|\Omega_1 = \rho_{\Omega_1,\Omega_2}(T).$$
(3.13)

In particular this allows us to express the fact that a distribution T on \(\Omega_2\) vanishes on an open subset \(\Omega_1\): \(\rho_{\Omega_1,\Omega_2}(T)=0\), or in concrete terms

$$T\circ i_{\Omega_2,\Omega_1}(\phi)=0\qquad \forall\,\phi \in \mathcal{D}(\Omega_1).$$

For convenience of notation the trivial extension map \(i_{\Omega_2,\Omega_1}\) is usually omitted and one writes

$$T(\phi)=0\qquad \forall\, \phi\in\mathcal{D}(\Omega_1)$$

to express the fact that a distribution T on \(\Omega_2\) vanishes on the open subset \(\Omega_1\). As a slight extension we state: Two distributions \(T_1\) and \(T_2\) on \(\Omega_2\) agree on an open subset \(\Omega_1\) if, and only if,

$$\rho_{\Omega_1,\Omega_2}(T_1)=\rho_{\Omega_1,\Omega_2}(T_2)$$

or in more convenient notation if, and only if,

$$T_1(\phi)=T_2(\phi)\qquad \forall\,\phi \in \mathcal{D}(\Omega_1).$$

The support of a function \(f:\Omega\to \mathbb K\) is defined as the closure of the set of those points in which the function does not vanish, or equivalently as the complement of the largest open subset of Ω on which f vanishes. The above preparations thus allow us to define the support of a distribution T on Ω as the complement of the largest open subset \(\Omega_1 \subset \Omega\) on which T vanishes. The support of T is denoted by supp T. It is characterized by the formula

$${\rm supp}\,T= \bigcap_{A\in C_T} A$$
(3.14)

where \(C_T\) denotes the set of all closed subsets A of Ω such that T vanishes on \(\Omega \backslash{A}\). Accordingly a point \(x \in \Omega\) belongs to the support of the distribution T on Ω if, and only if, T does not vanish on any open neighborhood U of x, i.e., for every open neighborhood U of x there is a \(\phi \in \mathcal{D}(U)\) such that \(T(\phi)\neq 0\).

In the Exercises one shows that this concept of support of distributions is compatible with the embedding of functions and the support defined for functions, i.e., one shows

$${\rm supp}\,I_f ={\rm supp}\, f \qquad \textrm{for all}\,f\in L^1_{loc}(\Omega).$$

A simple example shows that distributions can have a support consisting of one point: The support of the distribution T on Ω defined by

$$T(\phi)=\sum_{|\alpha|\leq m} c_{\alpha} D^{\alpha}\phi(x_0)$$
(3.15)

is the point \(x_0 \in \Omega\), for any choice of the constants \(c_{\alpha}\) and any \(m \in \mathbb N\). If a distribution is of the form (3.15) then certainly \(T(\phi)=0\) for all \(\phi \in \mathcal{D}(\mathbb{R}^n\backslash\left\{x_0\right\})\) since such test functions vanish in a neighborhood of \(x_0\) and thus all derivatives vanish there. And, if not all coefficients \(c_{\alpha}\) vanish, there are, in any neighborhood U of the point \(x_0\), test functions \(\phi\in \mathcal{D}(U)\) such that \(T(\phi)\neq 0\). This claim is addressed in the Exercises.

Furthermore, this formula actually gives the general form of a distribution whose support is the point \(x_0\). We show this later in Proposition 4.7.
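
The two halves of this argument can be tried out numerically in one dimension. The sketch below is a hedged illustration, assuming \(\Omega=\mathbb{R}\), \(x_0=0\) and \(m=1\), so that \(T(\phi)=c_0\phi(x_0)+c_1\phi^{\prime}(x_0)\). The derivative is only approximated by a central difference, and the helper names are hypothetical.

```python
import math

def bump(x):
    """Standard bump function: exp(-1/(1-x^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def shifted_bump(center, radius, slope=0.0):
    """Test function (1 + slope * (x - center)) * bump((x - center) / radius)."""
    return lambda x: (1.0 + slope * (x - center)) * bump((x - center) / radius)

def derivative(phi, x, h=1e-4):
    """Central difference approximation of phi'(x); an approximation only."""
    return (phi(x + h) - phi(x - h)) / (2.0 * h)

def point_functional(phi, x0, c0, c1):
    """T(phi) = c0 * phi(x0) + c1 * phi'(x0), a 1-D instance of (3.15)."""
    return c0 * phi(x0) + c1 * derivative(phi, x0)

# A test function supported in [0.5, 1.5], i.e. away from x0 = 0.
phi_away = shifted_bump(1.0, 0.5)
# A test function supported in the small neighborhood [-0.01, 0.01] of x0 = 0.
phi_near = shifted_bump(0.0, 1e-2, slope=1.0)
```

Here `point_functional(phi_away, 0.0, 2.0, -3.0)` vanishes because `phi_away` is zero near the origin. In contrast, `point_functional(phi_near, 0.0, 0.0, 1.0)` is approximately \(\textrm{e}^{-1}\), although the support of `phi_near` is very small. The factor `1 + slope * (x - center)` is included so that the first derivative at the center does not vanish.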

Since we have learned above when two distributions on Ω agree on an open subset, we know in particular when a distribution is equal to a \(\mathcal{C}^{\infty}\)-function, or more precisely when a distribution is equal to the regular distribution defined by a \(\mathcal{C}^{\infty}\)-function, on some open subset. This is used in the definition of the singular support of a distribution, which seems somewhat ad hoc but which has proved itself to be quite useful in the analysis of constant coefficient partial differential operators.

Definition 3.5

Let T be a distribution on a nonempty open set \(\Omega\subset \mathbb{R}^n\). The singular support of \(T\), denoted sing supp T, is the smallest closed subset of Ω in the complement of which T is equal to a \(\mathcal{C}^{\infty}\)-function.

We mention a simple one-dimensional example, Cauchy’s principal value \({\rm vp}\,\frac{1}{x}\). In the discussion following formula (3.8) we saw that \({\rm vp}\,\frac{1}{x}=I_{\frac{1}{x}}\) on \(\mathbb{R}\backslash\left\{0\right\}\). Since \(\frac{1}{x}\) is a \(\mathcal{C}^{\infty}\)-function on \(\mathbb{R}\backslash\left\{0\right\}\), \({\rm sing\,supp\,vp}\,\frac{1}{x} \subseteq \left\{0\right\}\). And since \(\left\{0\right\}\) is obviously the smallest closed subset of \(\mathbb{R}\) outside which the Cauchy principal value is equal to a \(\mathcal{C}^{\infty}\)-function, it follows that

$${\rm sing\,supp\,vp}\,\frac{1}{x}=\left\{0\right\}.$$

5 Tempered Distributions and Distributions with Compact Support

Tempered distributions are distributions which admit the Fourier transform as an isomorphism of topological vector spaces, and accordingly we will later devote a separate chapter to Fourier transformation and tempered distributions. This section just gives the basic definitions and properties of tempered distributions and distributions with compact support.

Recall the beginning of the section on the definition of distributions. What has been done there for general distributions will be done here for the subclasses of tempered and compactly supported distributions.

Definition 3.6

A tempered distribution T on an open nonempty subset \(\Omega \subset \mathbb{R}^n\) is a continuous linear functional on the test function space \({\mathcal S}(\Omega)\) of strongly decreasing \(\mathcal{C}^{\infty}\)-functions on Ω. The set of all tempered distributions on Ω equals the topological dual \({\mathcal S}^{\prime}(\Omega)\) of \({\mathcal S}(\Omega)\).

In analogy with Theorem 3.1, we have the following explicit characterization of tempered distributions.

Theorem 3.4

A linear functional \(T:{\mathcal S}(\Omega) \to \mathbb K\) is a tempered distribution on the open nonempty set \(\Omega\subset \mathbb{R}^n\) if, and only if, there exist a number \(C\in{\mathbb R}^{+}\) and natural numbers \(m,k \in\mathbb N\) , depending on T, such that for all \(\phi \in{\mathcal S}(\Omega)\) the estimate

$$|T(\phi)|\leq C p_{m,k}(\phi)$$
(3.16)

holds.

Proof

Recall the definition of the filtering system of norms of the space \({\mathcal S}(\Omega)\) and the condition of boundedness for a linear function \(T:{\mathcal S}(\Omega) \to \mathbb K\). Then it is clear that the above estimate characterizes T as being bounded on \({\mathcal S}(\Omega)\). Thus, by Theorem 2.4, this estimate characterizes continuity and we conclude. □

According to relation (3.2), we know that every tempered distribution is a distribution and therefore all results established for distributions apply to tempered distributions. Also, the basic definitions of convergence and of a Cauchy sequence are formally the same as soon as we replace the test function space \(\mathcal{D}(\Omega)\) by the larger test function space \({\mathcal S}(\Omega)\) and the topological dual \(\mathcal{D}^{\prime}(\Omega)\) of \(\mathcal{D}(\Omega)\) by the topological dual \({\mathcal S}^{\prime}(\Omega)\) of \({\mathcal S}(\Omega)\). Hence, we do not repeat these definitions, but we formulate the important counterpart of Theorem 3.3 explicitly.

Theorem 3.5

Equip the space \({\mathcal S}^{\prime}(\Omega)\) of tempered distributions on an open nonempty set \(\Omega \subseteq \mathbb{R}^n\) with the weak topology \(\sigma=\sigma({\mathcal S}^{\prime}(\Omega), {\mathcal S}(\Omega))\) . Then \({\mathcal S}^{\prime}(\Omega)\) is a sequentially complete HLCTVS.

Proof

As in the proof of Theorem one sees that \({\mathcal S}^{\prime}(\Omega)\) is an HLCTVS. By this theorem one also knows that a Cauchy sequence in \({\mathcal S}^{\prime}(\Omega)\) converges to some distribution T on Ω. In order to show that T is actually tempered, one proves that T is bounded on some open ball in \({\mathcal S}(\Omega)\). Since \({\mathcal S}(\Omega)\) is a complete metrizable space this can be done as in the proof of Theorem 3.3. Thus we conclude. □

Finally, we discuss briefly the space of distributions of compact support. Recall that a distribution \(T \in \mathcal{D}^{\prime}(\Omega)\) is said to have a compact support if there is a compact set \(K \subset \Omega\) such that \(T(\phi)=0\) for all \(\phi \in \mathcal{D}(\Omega\backslash K)\). The smallest of the compact subsets K for which this condition holds is called the support of T, denoted by \({\rm supp} T\). As we are going to explain now, distributions of compact support can be characterized topologically as elements of the topological dual of the test function space \(\mathcal{E}(\Omega)\). According to (C.3), the space \(\mathcal{E}(\Omega)\) is the space \(\mathcal{C}^{\infty}(\Omega)\) equipped with the filtering system of semi-norms \(\mathcal{P}_{\infty}(\Omega)=\left\{p_{K,m}: K\subset \Omega\;{\rm compact},m=0,1,2,\ldots\right\}\). Hence a linear function \(T:\mathcal{E}(\Omega) \to \mathbb K\) is continuous if, and only if, there are a compact set \(K\subset \Omega\), a constant \(C\in{\mathbb R}^{+}\) and an integer m such that

$$|T(\phi)|\leq C p_{K,m}(\phi) \qquad \forall\, \phi \in \mathcal{E}(\Omega).$$
(3.17)

Now suppose \(T \in \mathcal{E}^{\prime}(\Omega)\) is given. Then T satisfies condition (3.17) and by relation (3.2) we know that T is a distribution on Ω. Take any \(\phi \in \mathcal{D}(\Omega\backslash K)\), where K is the compact set appearing in (3.17). Then φ vanishes in some open neighborhood U of K and thus \(D^{\alpha}\phi(x)=0\) for all \(x \in K\) and all \(\alpha \in \mathbb N^n\). It follows that \(p_{K,m}(\phi)=0\) and thus \(T(\phi)=0\) for all \(\phi \in \mathcal{D}(\Omega\backslash K)\), hence \({\rm supp} T\subseteq K\). This shows that elements in \(\mathcal{E}^{\prime}(\Omega)\) are distributions with compact support.

Conversely, suppose that \(T \in \mathcal{D}^{\prime}(\Omega)\) has a support contained in a compact set \(K\subset \Omega\). There are functions \(u\in \mathcal{D}(\Omega)\) which are equal to 1 in an open neighborhood of K and which have their support in a slightly larger compact set \(K^{\prime}\) (see Exercises). It follows that \((1-u)\cdot \phi \in \mathcal{D}(\Omega\backslash K)\) and therefore \(T((1-u)\cdot \phi)=0\) or \(T(\phi)=T(u \cdot \phi)\) for all \(\phi \in \mathcal{D}(\Omega)\). For any \(\psi \in \mathcal{E}(\Omega)\), one knows \(u\cdot \psi \in \mathcal{D}_{K^{\prime}}(\Omega)\) and thus \(T_0(\psi)=T(u\cdot \psi)\) is a well-defined linear function \(\mathcal{E}(\Omega) \to \mathbb K\). (If \(v \in \mathcal{D}(\Omega)\) is another function which is equal to 1 in some open neighborhood of K, then \(u\cdot \psi - v \cdot \psi \in \mathcal{D}(\Omega\backslash K)\) and therefore \(T(u\cdot \psi - v \cdot \psi)=0\), so \(T_0\) does not depend on the choice of u.) Since T is a distribution, there are a constant \(C\in{\mathbb R}^{+}\) and \(m\in \mathbb N\) such that \(|T(\phi)|\leq C p_{K^{\prime},m}(\phi)\) for all \(\phi \in \mathcal{D}_{K^{\prime}}(\Omega)\). For all \(\psi \in \mathcal{E}(\Omega)\) we thus get

$$|T_0(\psi)| =|T(u \cdot \psi)| \leq C p_{K^{\prime},m}(u\cdot \psi)\leq C p_{K^{\prime},m}(u) p_{K^{\prime},m}(\psi).$$

This shows that \(T_0\) is continuous on \(\mathcal{E}(\Omega)\), i.e. \(T_0\in \mathcal{E}^{\prime}(\Omega)\). On \(\mathcal{D}(\Omega)\) the functionals \(T_0\) and T agree: \(T_0(\phi)=T(u\cdot \phi) = T(\phi)\) for all \(\phi \in \mathcal{D}(\Omega)\) as we have seen above and therefore we can formulate the following result.

Theorem 3.6

The topological dual \(\mathcal{E}^{\prime}(\Omega)\) of the test function space \(\mathcal{E}(\Omega)\) equals the space of distributions on Ω which have a compact support. Equipped with the weak topology \(\sigma= \sigma(\mathcal{E}^{\prime}(\Omega),\mathcal{E}(\Omega))\) the space \(\mathcal{E}^{\prime}(\Omega)\) of distributions with compact support is a sequentially complete HLCTVS.

Proof

The proof that \(\mathcal{E}^{\prime}(\Omega)\) is a sequentially complete HLCTVS is left as an exercise. The other statements have been proven above. □
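
The construction \(T_0(\psi)=T(u\cdot\psi)\) used above can be illustrated by a small Python sketch. It is meant as an example under assumptions made here, not as a general implementation: \(\Omega=\mathbb{R}\), the distribution is \(T(\phi)=-\phi^{\prime}(0)\) with support \(K=\left\{0\right\}\), the derivative is approximated by a central difference, and the cutoff u is built from a standard smooth step function. All names are hypothetical.

```python
import math

def smooth_step(t):
    """A C-infinity step function: 0 for t <= 0 and 1 for t >= 1."""
    def f(s):
        return math.exp(-1.0 / s) if s > 0.0 else 0.0
    return f(t) / (f(t) + f(1.0 - t))

def cutoff(inner, outer):
    """u with u = 1 on [-inner, inner] and supp u contained in [-outer, outer]."""
    return lambda x: smooth_step((outer - abs(x)) / (outer - inner))

def delta_prime(phi, h=1e-4):
    """T(phi) = -phi'(0), with the derivative approximated by a central difference."""
    return -(phi(h) - phi(-h)) / (2.0 * h)

def extend(T, u):
    """Return T_0 with T_0(psi) = T(u * psi) for smooth psi."""
    return lambda psi: T(lambda x: u(x) * psi(x))

# Two different cutoffs, both equal to 1 in a neighborhood of K = {0}.
u_small = cutoff(0.5, 1.0)
u_large = cutoff(2.0, 5.0)
```

The function \(\psi(x)=\textrm{e}^{x}\) belongs to \(\mathcal{E}(\mathbb{R})\) but not to \(\mathcal{D}(\mathbb{R})\). Both `extend(delta_prime, u_small)(math.exp)` and `extend(delta_prime, u_large)(math.exp)` give approximately \(-1\). This reflects the fact that \(T_0\) does not depend on the choice of the cutoff, as long as the cutoff equals 1 in a neighborhood of the support.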

6 Exercises

  1. 1.

    Let \(f:\Omega\to \mathbb{R}\) be a continuous function on an open nonempty set \(\Omega\subset \mathbb{R}^n\). Show: If \(\int f(x)\phi(x){\textrm d} x =0\) for all \(\phi \in \mathcal{D}(\Omega)\), then f = 0, i.e., the map \(I:\mathcal{C}(\Omega)\to \mathcal{D}^{\prime}(\Omega)\) of Theorem 3.2 is injective. Deduce that I is injective on all of \(L^1_{loc}(\Omega)\).

  2. 2.

    Prove: There is no \(f \in L^1_{loc}(\Omega)\) such that \(\delta_a(\phi)=\int f(x)\phi(x){\textrm d} x\) for all \(\phi\in \mathcal{D}(\Omega)\).

    Hint: It suffices to consider the case a = 0. Then define the function \(\rho: \mathbb{R}^n \to \mathbb{R}\) by

    $$\rho(x)=\left\{\begin{array}{@{}r@{\quad:\quad}l} 0 & {\rm for} |x|\geq 1,\\ \textrm{e}^{\frac{-1}{1- x^2}} & {\rm for} |x|<1,\end{array} \right.$$

    and define \(\rho_r(x)=\rho(\frac{x}{r})\) for \(r>0\). Recall that \(\rho_r \in \mathcal{D}(\Omega)\) and \(\rho_r(x)=0\) for all \(x \in \mathbb{R}^n\) with \(|x|> r\). Finally, observe that for \(f \in L^1_{loc}(\Omega)\) one has

    $$\lim _{r\to 0} \int_{ x: |x|\leq r} |f(x)|{\textrm d} x =0.$$
  3. 3.

    Consider the hyperplane \(H={\left\{ x=(x_1,\ldots,x_n) \in \mathbb{R}^n: x_1=0\right\}}\). Define a function \(\delta_H: \mathcal{D}(\mathbb{R}^n) \to \mathbb K\) by

    $$\langle \delta_H,\phi\rangle =\int_{\mathbb{R}^{n-1}} \phi(0,x_2,\ldots,x_n){\textrm d} x_2 \cdots{\textrm d} x_n\qquad \forall\, \phi \in \mathcal{D}(\mathbb{R}^n).$$

    Show that \(\delta_H\) is a distribution on \(\mathbb{R}^n\). It is called Dirac’s delta distribution on the hyperplane \(H\).

  4. 4.

    For any point \(a \in \Omega \subset \mathbb{R}^n\), Ω open and not empty, define a functional \(T:\mathcal{D}(\Omega) \to \mathbb K\) by

    $$\langle T,\phi \rangle =\sum_{i=1}^n \frac{\partial^2 \phi}{\partial x_i^2}(a)\equiv \triangle \phi(a) \qquad \forall \, \phi \in \mathcal{D}(\Omega).$$

    Prove: T is a distribution on Ω of order 2. On \(\Omega\backslash\left\{a\right\}\) this distribution is equal to the regular distribution \(I_0\) defined by the zero function.

  5. 5.

    Let \(S_{n-1}={\left\{ x \in \mathbb{R}^n:\, \sum_{i=1}^n x_i^2=1\right\}}\) be the unit sphere in \(\mathbb{R}^n\) and denote by \({\textrm d} \sigma\) the uniform measure on \(S_{n-1}\). The derivative in the direction of the outer normal of \(S_{n-1}\) is denoted by \(\frac{\partial}{\partial n}\). Now define a function \(T:\mathcal{D}(\mathbb{R}^n) \to \mathbb K\) by

    $$\langle T, \phi\rangle=\int_{S_{n-1}} \frac{\partial \phi}{\partial n} {\textrm d} \sigma \qquad \forall \, \phi \in \mathcal{D}(\mathbb{R}^n)$$

    and show that T is a distribution on \(\mathbb{R}^n\) of order 1 which is equal to the regular distribution \(I_0\) on \(\mathbb{R}^n\backslash S_{n-1}\).

  6. 6.

    Given a Cauchy sequence \((T_i)_{i\in \mathbb N}\) of distributions on a nonempty open set \(\Omega\subset \mathbb{R}^n\), prove in detail that the (pointwise or weak) limit T is a linear function \(\mathcal{D}(\Omega)\to \mathbb K\).

  7. 7.

    Let \(X[\mathcal{P}]\) be an HLCTVS, \(T \in X^{\prime}[\mathcal{P}]\) and \(r>0\). Show:

    $$U=\left\{x\in X:\,|T(x)|\leq r\right\}$$

    is a closed absolutely convex neighborhood of zero.