Abstract
Based on Lindgren and Salomonson’s analysis on Fréchet differentiability [Phys Rev A 67:056501 (2003)], we showed a specific variational path along which the Fréchet derivative of the Levy-Lieb functional does not exist in the unnormalized density domain. This conclusion still holds even when the density is restricted within a normalized space. Furthermore, we extended our analysis to the Lieb functional and demonstrated that the Lieb functional is not Fréchet differentiable. Along our proposed variational path, the Gâteaux derivative of the Levy-Lieb functional or the Lieb functional takes a different form from the corresponding one along other more conventional variational paths. This fact prompted us to define a new class of unconventional density variations and inspired us to present a modified density variation domain to eliminate the problems associated with such unconventional density variations.
References
Parr RG, Yang W (1989) Density-functional theory of atoms and molecules. Oxford University Press, New York
Hohenberg P, Kohn W (1964) Phys Rev 136:B864
Kohn W, Sham LJ (1965) Phys Rev 140:A1133
Wang YA, Xiang P (2013) In: Wesolowski TA, Wang YA (eds) Recent advances in orbital-free density functional theory, Chap. 1. World Scientific, Singapore, pp 3–12
Lieb EH (1983) Int J Quantum Chem 24:243
Englisch H, Englisch R (1983) Phys Stat Sol 123:711
Englisch H, Englisch R (1984) Phys Stat Sol 124:373
Lindgren I, Salomonson S (2003) Phys Rev A 67:056501
Lindgren I, Salomonson S (2003) Adv Quantum Chem 43:95
Lindgren I, Salomonson S (2004) Phys Rev A 70:032509
Ekeland I, Temam R (1976) Convex analysis and variational problems. North-Holland, Amsterdam
Harris J, Jones RO (1974) J Phys F 4:1170
Harris J (1984) Phys Rev A 29:1648
Gunnarsson O, Lundqvist BI (1976) Phys Rev B 13:4274
Langreth DC, Perdew JP (1980) Phys Rev B 21:5469
Wang YA (1997) Phys Rev A 55:4589
Wang YA (1997) Phys Rev A 56:1646
Levy M (1979) Proc Natl Acad Sci USA 76:6062
Nesbet RK (2001) Phys Rev A 65:010502
Nesbet RK (2003) Adv Quantum Chem 43:1
Dreizler RM, Gross EKU (1990) Density functional theory. Springer, Berlin
Davidson ER (1976) Reduced density matrices in quantum chemistry. Academic, New York
Perdew JP, Levy M (1985) Phys Rev B 31:6264
Englisch H, Englisch R (1983) Physica A 121:253
Zhang YA, Wang YA (2009) Int J Quantum Chem 109:3199
Milne RD (1980) Applied functional analysis: an introductory treatment. Pitman Publishing, UK
Acknowledgements
Financial support for this project was provided by a grant from the Natural Sciences and Engineering Research Council (NSERC) of Canada.
Appendices
Appendix 1
Here, we briefly introduce some mathematical concepts relevant to our discussion in the main text. All of the following content is adapted from an introductory book on functional analysis [26].
Definition 1
A vector space \(\mathcal {V}\) is a set of elements called vectors with two operations called addition and scalar multiplication, which satisfy the following axioms.
- Addition axioms: To every pair of vectors \(x,y \in \mathcal {V}\), there corresponds a unique vector \(x+y \in \mathcal {V}\), the sum of x and y, such that
  1. \(x+y=y+x\);
  2. \((x+y)+z=x+(y+z)\);
  3. there exists a unique zero vector \(\theta \in \mathcal {V}\) such that \(x+\theta = \theta +x =x, \forall x\in \mathcal {V}\);
  4. for every vector x there exists a unique vector \((-x)\in \mathcal {V}\) such that \(x+(-x)=\theta \).
- Scalar multiplication axioms: To every scalar \(\alpha \) and every vector \(x\in \mathcal {V}\) there corresponds a unique vector \(\alpha x \in \mathcal {V}\) such that
  1. \(\alpha (\beta x) = (\alpha \beta ) x \) for every scalar \(\beta \);
  2. \(1x=x, 0x=\theta , \forall x \in \mathcal {V}\);
  3. \(\alpha (x+y) = \alpha x + \alpha y\) and \((\alpha + \beta ) x = \alpha x + \beta x\).
Definition 2
If x and y are two points of a vector space, then the line segment joining them is the set of elements \(\{\beta x + (1-\beta )y\, |\, 0 \le \beta \le 1\}\). A subset S of a vector space is convex if the line segment joining any two points in S is contained in S.
Definition 3
Let \(\mathcal {U}\) and \(\mathcal {V}\) be two vector spaces with the same system of scalars. Then a function (or mapping) that maps each element of \(\mathcal {V}\) uniquely to an element of \(\mathcal {U}\),
\[T: \mathcal {V} \rightarrow \mathcal {U},\]
is called a linear transformation of \(\mathcal {V}\) into \(\mathcal {U}\) if
  1. \(T(x+y) = Tx + Ty, \forall x,y \in \mathcal {V}\);
  2. \(T(\alpha x)=\alpha T x, \forall x \in \mathcal {V}\) and for all scalars \(\alpha \).
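The two conditions of Definition 3 can be probed numerically on a finite-dimensional example. The sketch below is only an illustration: the maps `matrix_map` and `shifted_map` and the helper `looks_linear` are hypothetical names introduced here, and a randomized test of this kind can refute linearity but never prove it.

```python
import random
from typing import Callable, List

Vector = List[float]


def add(x: Vector, y: Vector) -> Vector:
    """Component-wise vector addition."""
    return [a + b for a, b in zip(x, y)]


def scale(alpha: float, x: Vector) -> Vector:
    """Scalar multiplication."""
    return [alpha * a for a in x]


def looks_linear(T: Callable[[Vector], Vector], dim: int,
                 trials: int = 200, tol: float = 1e-9, seed: int = 0) -> bool:
    """Test T(x+y) = Tx + Ty and T(alpha x) = alpha Tx on random samples."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        y = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        alpha = rng.uniform(-2.0, 2.0)
        additive = zip(T(add(x, y)), add(T(x), T(y)))
        homogeneous = zip(T(scale(alpha, x)), scale(alpha, T(x)))
        if any(abs(a - b) > tol for a, b in additive):
            return False
        if any(abs(a - b) > tol for a, b in homogeneous):
            return False
    return True


def matrix_map(x: Vector) -> Vector:
    """Hypothetical linear map R^2 -> R^2 given by a fixed matrix."""
    return [2.0 * x[0] + x[1], -x[0] + 3.0 * x[1]]


def shifted_map(x: Vector) -> Vector:
    """Hypothetical affine map; the constant shift breaks additivity."""
    return [x[0] + 1.0, x[1]]
```

Here `looks_linear(matrix_map, 2)` passes both conditions on every sample, whereas `shifted_map` already fails the additivity condition.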
Definition 4
A metric (or distance function) on a set S is a real-valued function d(x, y) defined for all pairs of elements x and y in S and which satisfies the following axioms:
  1. \(d(x,y)\ge 0\); \(d(x,y)=0\) if and only if \(x=y\);
  2. \(d(x,y)=d(y,x), \forall x,y \in S\);
  3. \(d(x,z)\le d(x,y)+ d(y,z), \forall x,y,z \in S\).
A metric space denoted by (S, d) consists of a set S and a metric d on S.
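The metric axioms of Definition 4 can be checked by brute force on a finite sample of points. The following sketch uses hypothetical helper names (`is_metric_on`, `euclid`, `squared`) and a small hand-picked point set; passing on a sample does not prove that a function is a metric, but a single failure shows that it is not.

```python
import itertools
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]


def is_metric_on(points: Sequence[Point],
                 d: Callable[[Point, Point], float],
                 tol: float = 1e-12) -> bool:
    """Check non-negativity, identity, symmetry and the triangle inequality."""
    for x, y in itertools.product(points, repeat=2):
        if d(x, y) < -tol:
            return False                      # axiom 1: d >= 0
        if (d(x, y) <= tol) != (x == y):
            return False                      # axiom 1: d = 0 iff x = y
        if abs(d(x, y) - d(y, x)) > tol:
            return False                      # axiom 2: symmetry
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:
            return False                      # axiom 3: triangle inequality
    return True


def euclid(x: Point, y: Point) -> float:
    """Euclidean distance in the plane."""
    return math.dist(x, y)


def squared(x: Point, y: Point) -> float:
    """Squared Euclidean distance; it violates the triangle inequality."""
    return math.dist(x, y) ** 2


SAMPLE = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.5, 1.5)]
```

On `SAMPLE`, the Euclidean distance satisfies all three axioms, while the squared distance fails axiom 3 for the collinear points (0,0), (1,0), (2,0), since 4 > 1 + 1.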
Definition 5
Let T be an operator (mapping, transformation) whose domain Dom(T) and range Ran(T) belong to metric spaces \((X,d_X)\) and \((Y,d_Y)\), respectively. The operator T is continuous at point \(x_0 \in Dom(T)\) if, for every \(\epsilon > 0\), there exists \(\delta >0\) such that
\[d_Y(Tx, Tx_0) < \epsilon \]
whenever
\[d_X(x, x_0) < \delta .\]
Definition 6
A sequence \(\{x^{(k)} \}\) in a metric space (S, d) is said to be a Cauchy sequence if \(d(x^{(k)},x^{(l)})\rightarrow 0\) as \(k,l\rightarrow \infty \). This means that for every \(\delta >0\) there exists \(N_{\delta }\) such that \(d(x^{(k)},x^{(l)})\le \delta \) for any \(k,l\ge N_{\delta }\).
Definition 7
A metric space (S, d) is said to be complete if every Cauchy sequence in (S, d) has a limit in (S, d).
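Definitions 6 and 7 can be illustrated with the partial sums of \(\sum _j 1/j^2\) in the real line with the usual distance. The sketch below relies on the elementary tail bound \(\sum _{j>k} 1/j^2 < 1/k\), so that \(N_{\delta } = \lceil 1/\delta \rceil \) is an admissible (not optimal) index; the function names are hypothetical.

```python
import math


def partial_sum(k: int) -> float:
    """x^(k) = sum_{j=1}^{k} 1/j^2."""
    return sum(1.0 / (j * j) for j in range(1, k + 1))


def cauchy_index(delta: float) -> int:
    """An index N_delta with |x^(k) - x^(l)| <= delta for all k, l >= N_delta."""
    return math.ceil(1.0 / delta)


def max_gap_beyond(n: int, span: int = 200) -> float:
    """Largest distance between any two of the terms x^(n), ..., x^(n+span-1)."""
    terms = [partial_sum(k) for k in range(n, n + span)]
    return max(terms) - min(terms)
```

Because the real line is complete, this Cauchy sequence has a limit in the space, namely \(\pi ^2/6\); the same sequence viewed inside the rationals is still Cauchy but has no rational limit.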
Definition 8
A norm (or length function) on a vector space \(\mathcal {V}\) is a real-valued function, ||x||, defined for all vectors \(x\in \mathcal {V}\) and which satisfies the following axioms:
  1. \(||x||\ge 0\); \(||x||=0\) if and only if \(x=\theta \);
  2. \(||x + y ||\le ||x|| + ||y||\), \(\forall x, y \in \mathcal {V}\);
  3. \(||\lambda x|| =|\lambda | \cdot ||x||\), for an arbitrary scalar \(\lambda \).
A normed vector space, denoted by \((\mathcal {V}, ||\cdot ||)\) consists of a vector space \(\mathcal {V}\) and a norm \(||\cdot ||\) on \(\mathcal {V}\).
Definition 9
A complete (with respect to the norm) normed vector space is called a Banach space.
Definition 10
Let \(T: \mathcal {V}\rightarrow \mathcal {U}\) be a bounded linear transformation, that is,
\[||Tx|| \le K\, ||x||, \quad \forall x \in \mathcal {V},\]
for some constant \(K \ge 0\).
The smallest value of K which satisfies this inequality is denoted by ||T|| and called the norm of T. It can be verified that this norm for operators satisfies the axioms for a norm function and that we may therefore talk of the vector space of bounded linear transformations \(T: \mathcal {V} \rightarrow \mathcal {U}\). This normed vector space is denoted by \(\mathcal {L}(\mathcal {V},\mathcal {U})\).
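For a linear map on a finite-dimensional Euclidean space, the smallest admissible K in Definition 10 is the largest singular value of its matrix. The sketch below compares a brute-force estimate of \(\sup ||Tx||/||x||\) over unit vectors with the closed-form value for a 2 by 2 matrix; the matrix `A` and all function names are hypothetical choices made for illustration.

```python
import math
from typing import Tuple

Matrix = Tuple[Tuple[float, float], Tuple[float, float]]


def apply(A: Matrix, x: Tuple[float, float]) -> Tuple[float, float]:
    """Matrix-vector product for a 2 by 2 matrix."""
    return (A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1])


def sampled_operator_norm(A: Matrix, samples: int = 100000) -> float:
    """Estimate sup ||Ax|| over unit vectors x by scanning a half circle."""
    best = 0.0
    for k in range(samples):
        t = math.pi * k / samples          # ||A(-x)|| = ||Ax||, half circle suffices
        y = apply(A, (math.cos(t), math.sin(t)))
        best = max(best, math.hypot(y[0], y[1]))
    return best


def exact_operator_norm(A: Matrix) -> float:
    """Largest singular value: sqrt of the largest eigenvalue of A^T A."""
    a = A[0][0] ** 2 + A[1][0] ** 2
    b = A[0][0] * A[0][1] + A[1][0] * A[1][1]
    c = A[0][1] ** 2 + A[1][1] ** 2
    largest = 0.5 * (a + c + math.sqrt((a - c) ** 2 + 4.0 * b * b))
    return math.sqrt(largest)


A: Matrix = ((2.0, 1.0), (0.0, 1.0))
```

The sampled value never exceeds the exact one and approaches it as the angular grid is refined, which is the finite-dimensional content of "the smallest value of K".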
Definition 11
Consider an operator \(T: \mathcal {V}\rightarrow \mathcal {U}\) where \(\mathcal {V}\) is a vector space and \(\mathcal {U}\) is a normed vector space. Let \(x \in Dom(T)\subset \mathcal {V}\) and \(s\in \mathcal {V}\): if the limit
\[dT(x;s) = \lim _{\lambda \rightarrow 0} \frac{T(x+\lambda s) - T(x)}{\lambda }\]
exists, it is called the Gâteaux differential of T at x in the direction s. The limit is to be understood in the sense of convergence with respect to the norm in \(\mathcal {U}\). The differential may exist for some s and fail to exist for others: if the differential exists at x for all s we say that T is Gâteaux differentiable at x.
The Gâteaux differential is homogeneous in s in the sense that
\[dT(x;\alpha s) = \alpha \, dT(x;s)\]
but is in general neither linear nor continuous in s. Nor does the existence of the Gâteaux differential at x ensure continuity of T at x. For example,
At point (0, 0), it can be easily shown that the Gâteaux differential exists and it is zero. Clearly, the Gâteaux differential is a continuous linear operator. However, f is not continuous at (0,0). Therefore, we cannot relate the Gâteaux differentiability of T to the continuity of T.
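A concrete function with exactly these properties can be checked numerically. The sketch below assumes \(f(x,y) = x^3 y/(x^6+y^2)\) with \(f(0,0)=0\), a standard textbook choice that is used here as an assumption and need not be the example given in [26]. Along any straight line through the origin the difference quotient tends to zero, while along the curve \(y = x^3\) the function stays at 1/2.

```python
from typing import Tuple


def f(x: float, y: float) -> float:
    """Assumed example: f = x^3 y / (x^6 + y^2) away from the origin, f(0,0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x ** 3 * y / (x ** 6 + y ** 2)


def gateaux_quotient(s: Tuple[float, float], lam: float) -> float:
    """Difference quotient [f(0 + lam*s) - f(0)] / lam at the origin."""
    return (f(lam * s[0], lam * s[1]) - f(0.0, 0.0)) / lam


def value_on_cubic(t: float) -> float:
    """f evaluated on the curve y = x^3, which approaches the origin as t -> 0."""
    return f(t, t ** 3)
```

For a fixed direction s = (a, b) with b nonzero, the quotient equals \(\lambda a^3 b/(\lambda ^4 a^6 + b^2)\) and vanishes as \(\lambda \rightarrow 0\), but the smaller |b| is, the smaller \(\lambda \) must be before the quotient becomes small. This non-uniformity in s is what allows f to be discontinuous at the origin even though every directional limit is zero.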
Let us go forward on the basis that \(\mathcal {V}\) is also a normed vector space. Suppose dT(x; s) is linear and continuous in s for some \(x\in \mathcal {V}\), then we may write
\[dT(x;s) = T'_{G}(x)\, s.\]
The operator \(T'_G(x)\) is, by definition, a mapping \(\mathcal {V}\rightarrow \mathcal {U}\) and is linear and continuous: we may conclude that
\[T'_{G}(x) \in \mathcal {L}(\mathcal {V},\mathcal {U}).\]
This operator is called the Gâteaux or weak derivative of T at x. It is very important to note that when speaking of the linearity and continuity of \(T'_{G}(x)\), we mean those properties in the operator sense, that is, with respect to s at a fixed x. \(T'_{G}\) itself may be a function of x, but its continuity and linearity with respect to the variable x are completely different things from the continuity and linearity discussed here.
When \(T'_{G}(x)\) exists, it is certainly true that
\[T(x+\lambda s) = T(x) + \lambda \, T'_{G}(x)\, s + \epsilon ,\]
where \(\epsilon /\lambda \rightarrow 0\) as \(\lambda \rightarrow 0\) with x and s fixed. However, the convergence may not be uniform with respect to s and in that case T cannot be approximated by a linear operator with uniform accuracy in the neighborhood of x. If we further demand uniform convergence then we arrive at the strong derivative.
Definition 12
Let \(\mathcal {V}\) and \(\mathcal {U}\) be normed vector spaces. An operator \(T: \mathcal {V}\rightarrow \mathcal {U}\) is Fréchet differentiable at \(x\in Dom(T)\subset \mathcal {V}\) if there exists a continuous linear operator \(T'_F(x) \in \mathcal {L}(\mathcal {V},\mathcal {U})\) such that, for all \(s\in \mathcal {V}\),
\[T(x+s) = T(x) + T'_F(x)\, s + \epsilon (x,s),\]
with
\[\lim _{||s||\rightarrow 0} \frac{||\epsilon (x,s)||}{||s||} = 0.\]
The operator \(T'_F(x)\) is called the Fréchet or strong derivative of T at x. The Fréchet derivative at x is unique. It can be shown that the existence of the Fréchet derivative of T at x implies continuity of T at x.
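The difference between the weak and the strong derivative shows up in the remainder ratio \(||\epsilon (x,s)||/||s||\). The sketch below evaluates this ratio for two hypothetical examples on the plane: the squared norm, which is Fréchet differentiable everywhere, and the assumed function \(x^3 y/(x^6+y^2)\), whose directional derivatives at the origin all vanish but whose remainder ratio blows up along the curve \(y=x^3\). Both functions and all names are illustrative choices, not taken from [26].

```python
import math
from typing import Callable, Tuple

Point = Tuple[float, float]


def remainder_ratio(T: Callable[[Point], float],
                    linear_part: Callable[[Point], float],
                    x0: Point, s: Point) -> float:
    """|T(x0 + s) - T(x0) - T'(x0) s| / ||s|| for a candidate derivative."""
    shifted = (x0[0] + s[0], x0[1] + s[1])
    eps = T(shifted) - T(x0) - linear_part(s)
    return abs(eps) / math.hypot(s[0], s[1])


def squared_norm(p: Point) -> float:
    """T(x) = ||x||^2."""
    return p[0] * p[0] + p[1] * p[1]


def squared_norm_derivative_at(x0: Point) -> Callable[[Point], float]:
    """Candidate derivative of ||x||^2 at x0: s -> 2 <x0, s>."""
    return lambda s: 2.0 * (x0[0] * s[0] + x0[1] * s[1])


def curved(p: Point) -> float:
    """Assumed example with vanishing directional derivatives at the origin."""
    x, y = p
    if x == 0.0 and y == 0.0:
        return 0.0
    return x ** 3 * y / (x ** 6 + y ** 2)


def zero_map(s: Point) -> float:
    """Candidate derivative of `curved` at the origin (the zero functional)."""
    return 0.0
```

For the squared norm the ratio equals \(||s||\) in every direction, so it tends to zero uniformly. For `curved`, the ratio along \(s=(t,t^3)\) behaves like \(1/(2t)\), so no continuous linear operator can serve as a Fréchet derivative at the origin.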
Theorem 1
If the Gâteaux derivative \(T'_{G}(x)\) exists in a neighborhood of x and is continuous with respect to the norm in \(\mathcal {L}(\mathcal {V},\mathcal {U})\) at x, then the Fréchet derivative \(T'_F(x)\) exists and is equal to \(T'_{G}(x)\).
Appendix 2
In this appendix, we show that the Fréchet derivative does not exist in the normalized density domain, \(\mathcal {J}_N\).
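Define a normalized path wavefunction,
\[\Psi _{p} = \sqrt{1-\beta ^2}\, \Psi _0 + \beta \, \Psi _{\mathcal D}, \tag{113}\]
where \(\Psi _0\) is the GS wavefunction for an N-electron quantum system, \(\Psi _{\mathcal D}\) is a linear combination of eigenfunctions in \(\mathcal {D}\) of \(\Psi _0\), and \(0\le \beta \le 1\). Both \(\Psi _0\) and \(\Psi _{\mathcal D}\) are normalized to 1. The corresponding path density is
\[\rho _{p}(\mathbf {r}) = (1-\beta ^2)\, \rho _0(\mathbf {r}) + \beta ^2 \rho _{\mathcal D}(\mathbf {r}), \tag{114}\]
with \(\rho _0\) and \(\rho _{\mathcal D}\) the densities of \(\Psi _0\) and \(\Psi _{\mathcal D}\), respectively.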
When \(\beta \) approaches 0, \(\rho _{p}(\mathbf {r})\) also approaches \(\rho _0(\mathbf {r})\). Letting \(\beta \) change continuously from 1 to 0, we obtain the desired density variational path. Equation (114) shows that the path density is automatically normalized to N; therefore, the density variation stays within the normalized space. Clearly, \(\rho _{p}(\mathbf{r})\) lies in the neighborhood of \(\rho _{0}(\mathbf{r})\) within \(\mathcal {J}_N\). For convenience, we label \(\mathcal {B}_N\) as the set of all legitimate N-representable \(\rho _{p}(\mathbf{r})\) defined for a given \(\Psi _{0}\) or \(\rho _{0}(\mathbf{r})\) in Eq. (114).
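As a quick check of the normalization claim, assume that the path density has the form \(\rho _p = (1-\beta ^2)\rho _0 + \beta ^2 \rho _{\mathcal D}\), where \(\rho _{\mathcal D}\) denotes the density of \(\Psi _{\mathcal D}\). This form and the symbol \(\rho _{\mathcal D}\) are assumptions made for illustration; they follow if the cross term between \(\Psi _0\) and \(\Psi _{\mathcal D}\) does not contribute to the density. Integrating over all space then gives:

```latex
\begin{align*}
\int \rho_p(\mathbf{r})\, d\mathbf{r}
  &= (1-\beta^2) \int \rho_0(\mathbf{r})\, d\mathbf{r}
     + \beta^2 \int \rho_{\mathcal{D}}(\mathbf{r})\, d\mathbf{r} \\
  &= (1-\beta^2)\, N + \beta^2 N = N , \\
\rho_p(\mathbf{r}) - \rho_0(\mathbf{r})
  &= \beta^2 \left[ \rho_{\mathcal{D}}(\mathbf{r}) - \rho_0(\mathbf{r}) \right] .
\end{align*}
```

Under this assumption the density variation is proportional to \(\beta ^2\) and integrates to zero, which is why the path never leaves the normalized domain and why \(\rho _p\) approaches \(\rho _0\) as \(\beta \rightarrow 0\).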
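A trial wavefunction is then assumed to yield the same path density:
\[\widetilde{\Psi } = \sqrt{1-\beta ^2}\, \Psi _0 + \lambda \, \Psi _t, \qquad \Psi _t = \sum _i c_i \Psi _i, \tag{115}\]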
where \(\Psi _i\) is the ith normalized eigenfunction of \(\hat{H}\), \(\langle \Psi _t|\Psi _t\rangle =1\), and the expansion coefficients \(\{c_i\}\) are chosen to be real. The complete set of \(\{\Psi _i\}\) can be divided into three parts: \(\Psi _0\), \(\mathcal S\), and \(\mathcal D\). The electron density (the trial density) for \(\widetilde{\Psi }\) takes the following form:
At any point, the trial density is identical to the path density to ensure that the density variation is actually along the path we designed:
Therefore, we have
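Substituting Eqs. (114) and (116) into Eq. (118) and simplifying the result, one derives
\[\lambda ^2 + 2\sqrt{1-\beta ^2}\, c_0 \lambda = \beta ^2. \tag{119}\]
At a specific point on the variational path, where the value of \(\beta \) is fixed, we can solve for \(\lambda \) in terms of \(\beta \) based on Eq. (119):
\[\lambda = -\sqrt{1-\beta ^2}\, c_0 \pm \sqrt{(1-\beta ^2)\, c_0^2 + \beta ^2}. \tag{120}\]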
Near the end of the variational path, when \(\beta \rightarrow 0\) and \(c_0 \ne 0\),
Again (see Appendix 3), the positive sign is chosen in Eq. (121), and we have
\[\lambda \approx \frac{\beta ^2}{2c_0}. \tag{122}\]
Immediately, we can conclude that towards the end of the variational path, \(\lambda \) is of the same order of magnitude as \(\beta ^2/c_0\). In other words, \(\lambda \) approaches zero at nearly the same rate as \(\beta ^2/c_0\) does.
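The scaling of \(\lambda \) with \(\beta \) can be checked numerically. The sketch below assumes that the trial wavefunction has the form \(\widetilde{\Psi } = a\Psi _0 + \lambda \Psi _t\) with \(a=\sqrt{1-\beta ^2}\) (the constant quoted in Appendix 3), \(\langle \Psi _0|\Psi _t\rangle = c_0\), and \(\langle \Psi _t|\Psi _t\rangle = 1\), so that normalization of \(\widetilde{\Psi }\) requires \(a^2 + 2ac_0\lambda + \lambda ^2 = 1\). This quadratic is an assumption reconstructed from the surrounding text, and the function names are hypothetical.

```python
import math
from typing import Tuple


def lambda_roots(beta: float, c0: float) -> Tuple[float, float]:
    """Both roots (plus sign, minus sign) of lam^2 + 2*a*c0*lam - beta^2 = 0."""
    if beta == 0.0 and c0 == 0.0:
        return 0.0, 0.0
    a = math.sqrt(1.0 - beta * beta)
    disc = math.sqrt(a * a * c0 * c0 + beta * beta)
    if c0 >= 0.0:
        # The product of the roots is -beta^2; this form avoids the
        # cancellation in (-a*c0 + disc) when beta is small.
        lam_plus = beta * beta / (a * c0 + disc)
        lam_minus = -a * c0 - disc
    else:
        lam_plus = -a * c0 + disc
        lam_minus = -beta * beta / lam_plus
    return lam_plus, lam_minus


def norm_squared(beta: float, c0: float, lam: float) -> float:
    """<Psi~|Psi~> = a^2 + 2*a*c0*lam + lam^2 under the assumptions above."""
    a = math.sqrt(1.0 - beta * beta)
    return a * a + 2.0 * a * c0 * lam + lam * lam
```

For positive \(c_0\) and small \(\beta \), the plus root behaves like \(\beta ^2/(2c_0)\), while the minus root tends to \(-2ac_0\) and does not vanish; this is the case treated separately in Appendix 3.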
Because of Eqs. (114), (116), and (117), we obtain
Substituting Eq. (119) into Eq. (123) yields
where the summation on the LHS is only within \(\mathcal {S}_0\). At \(\beta \rightarrow 0\), we find that the coefficients \(\{c_i\}\) for \(\Psi _i \in {\mathcal S}_0\) are linear in \(\lambda \).
Having established the behavior of \(\{c_i\}\) for wavefunctions in \(\mathcal {S}_0\), we then investigate the remaining \(\{c_i\}\) for wavefunctions in \(\mathcal {D}\). At one particular point on the variational path (\(\beta \) fixed), we optimize the trial wavefunction to find the set of coefficients \(\{c_i\}\) that yields the lowest energy for
where Eq. (119) has been used to simplify the expression after the second equal sign. Obviously, we only need to minimize the last term in Eq. (125) under the following two constraints:
and
The density constraint, Eq. (127), is equivalent to the following identity based on our previous analysis:
We will use the Euler-Lagrange multiplier method to find the set of coefficients \(\{c_{i}\}\) that minimizes the value of \(\lambda ^{2}\left( \langle \Psi _{t}|\hat{H}|\Psi _{t}\rangle -E_{0}\right) \). Let
and
where h and \(g(\mathbf{r})\) are the Lagrange multipliers corresponding to the two constraints in Eqs. (129) and (130). Minimizing Eq. (131) with respect to \(\{c_{i}\}\), one obtains
and
Because \(c_0\) is linear in \(\lambda \) as we previously showed, we can readily infer from Eq. (132) that \(g(\mathbf{r})\) must take the following form:
Substituting Eq. (134) into Eq. (133) and ignoring the higher-order terms as \(\lambda \rightarrow 0\), we obtain an equation for \(\Psi _i\in {\mathcal S}\),
where “h.o.” denotes higher-order terms in \(\lambda \). Therefore, we reach the same conclusion as before: the coefficients \(\{c_i\}\) for \(\Psi _i \in \mathcal S\) are linear in \(\lambda \) towards the end of the variational path. For the \(\{c_i\}\) with \(\Psi _i \in \mathcal D\), utilizing the additional fact that \(\Psi _i\) is order-1 strongly orthogonal to \(\Psi _0\), we can further simplify Eq. (133) to
For this equation to be valid at \(\lambda \rightarrow 0\), the LHS and the RHS must have the same dependence on \(\lambda \). On the RHS, the first term decays faster than the second term, and the second term will dominate when \(\lambda \) approaches 0. Therefore, we must match the magnitude of the second term on the RHS to the LHS. Of course, we cannot match it with the second term on the LHS because doing so would lead to self-inconsistency. Then, the second term on the RHS must decay in the same way as the first term on the LHS. Thus, \(\{c_i\}\) for \(\Psi _i \in \mathcal D\) are proportional to \(\lambda ^3\). Unfortunately, such a \(\lambda ^{3}\)-behavior contradicts the normalization constraint in Eq. (126), because \(\sum _i c_{i}^{2}\) will become 0 as \(\lambda \rightarrow 0\). Hence, we conclude that this contradiction must come from the assumption \({\Psi _{t}=\sum _i c_{i}\Psi _{i}}\), where the expansion is over the complete set of eigenfunctions of \(\hat{H}\).
To resolve the contradiction, we have to modify our assumption about the expansion of \(\Psi _{t}\). We notice that if the summation \(\sum _{i}c_{i}\Psi _{i}\) includes any wavefunction from \(\mathcal {S}_{0}\), the same problem will persist. Therefore, \(\Psi _{t}\) can only be expanded in \(\mathcal {D}\),
In this case, Eq. (127) is equivalent to
Integrating both sides of Eq. (138) over the entire space, one obtains
which further ensures that
Now, the original minimization process is reduced to minimizing the following term,
Suppose this minimization yields the optimal set of expansion coefficients, \(\{\bar{c}_i\}\), which, judging from the form of Eq. (141), do not depend on \(\lambda \) or \(\beta \). Then, we have
where Eq. (139) is used to simplify the expression after the third equal sign. Evidently, Eq. (142) suggests that the condition for Fréchet differentiability proposed by Lindgren and Salomonson [8,9,10] is not fulfilled. In other words, the Fréchet derivative does not exist in the normalized density domain \(\mathcal {J}_N\) either.
Appendix 3
In this appendix, we analyze the consequence of choosing the negative sign in Eqs. (41) and (121). In the end, we will conclude that this particular choice is fully equivalent to the more natural decision made in the main text and Appendix 2.
Let us start from a unified version of Eqs. (40) and (120):
\[\lambda = -a c_0 \pm \sqrt{a^2 c_0^2 + \beta ^2}, \tag{143}\]
where the constant a equals 1 in the main text and \(\sqrt{1-\beta ^2}\) in Appendix 2. Obviously, if \(c_0 = 0\) or \(c_0 \rightarrow 0\) as \(\beta \rightarrow 0\), both \(\lambda \) and \(\beta \) approach 0 concurrently near the end of the variational path.
We only need to further examine the situation when \(c_0 \ne 0\) as \(\beta \rightarrow 0\) with the choice of the negative sign in Eq. (143):
where the residual term \(\lambda '\) approaches 0 as \(\beta \rightarrow 0\):
Consequently, Eqs. (46) and (124) can be rewritten as
which immediately suggests that as \(\beta \rightarrow 0\), \((\rho _{0}-\rho _{t})\) and the coefficients \(\{c_i\}\) for \(\Psi _i \in {\mathcal S}\) are linear in \(\lambda '\). Because \(\rho _{t} \rightarrow \rho _{0}\), \(\Psi _t \rightarrow c_0 \Psi _0\) with \(|c_0| \rightarrow 1\), as \(\beta \rightarrow 0\). Therefore, at the end of the variational path (\(\beta =0\) and \(\lambda ' = 0\)), \(\lambda = -2ac_0\), \(\Psi _t = c_0 \Psi _0\), \(|c_0| = 1\), and \(\widetilde{\Psi } = -a \Psi _0\).
Evidently, the choice of the negative sign in Eqs. (41) and (121) yields a fully equivalent, alternative trial wavefunction,
where \(\lambda ' \rightarrow 0\) as \(\beta \rightarrow 0\). Then, we can carry out the discussion on the basis of \(\lambda ' \rightarrow 0\) instead.
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Xiang, P., Wang, Y.A. (2018). Functional Derivatives and Differentiability in Density-Functional Theory. In: Wang, Y., Thachuk, M., Krems, R., Maruani, J. (eds) Concepts, Methods and Applications of Quantum Systems in Chemistry and Physics. Progress in Theoretical Chemistry and Physics, vol 31. Springer, Cham. https://doi.org/10.1007/978-3-319-74582-4_18
DOI: https://doi.org/10.1007/978-3-319-74582-4_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-74581-7
Online ISBN: 978-3-319-74582-4
eBook Packages: Chemistry and Materials Science; Chemistry and Material Science (R0)