
Functional Derivatives and Differentiability in Density-Functional Theory

  • Conference paper
Concepts, Methods and Applications of Quantum Systems in Chemistry and Physics

Part of the book series: Progress in Theoretical Chemistry and Physics ((PTCP,volume 31))


Abstract

Based on Lindgren and Salomonson’s analysis of Fréchet differentiability [Phys Rev A 67:056501 (2003)], we identified a specific variational path along which the Fréchet derivative of the Levy-Lieb functional does not exist in the unnormalized density domain. This conclusion still holds when the density is restricted to a normalized space. Furthermore, we extended our analysis to the Lieb functional and demonstrated that the Lieb functional is not Fréchet differentiable. Along our proposed variational path, the Gâteaux derivative of the Levy-Lieb functional or the Lieb functional takes a different form from the corresponding one along other, more conventional variational paths. This fact prompted us to define a new class of unconventional density variations and inspired us to present a modified density variation domain to eliminate the problems associated with such unconventional density variations.


References

  1. Parr RG, Yang W (1989) Density-functional theory of atoms and molecules. Oxford University Press, New York

  2. Hohenberg P, Kohn W (1964) Phys Rev 136:B864

  3. Kohn W, Sham LJ (1965) Phys Rev 140:A1133

  4. Wang YA, Xiang P (2013) In: Wesolowski TA, Wang YA (eds) Recent advances in orbital-free density functional theory, Chap. 1. World Scientific, Singapore, pp 3–12

  5. Lieb EH (1983) Int J Quantum Chem 24:243

  6. Englisch H, Englisch R (1983) Phys Stat Sol 123:711

  7. Englisch H, Englisch R (1984) Phys Stat Sol 124:373

  8. Lindgren I, Salomonson S (2003) Phys Rev A 67:056501

  9. Lindgren I, Salomonson S (2003) Adv Quantum Chem 43:95

  10. Lindgren I, Salomonson S (2004) Phys Rev A 70:032509

  11. Ekeland I, Temam R (1976) Convex analysis and variational problems. North-Holland, Amsterdam

  12. Harris J, Jones RO (1974) J Phys F 4:1170

  13. Harris J (1984) Phys Rev A 29:1648

  14. Gunnarsson O, Lundqvist BI (1976) Phys Rev B 13:4274

  15. Langreth DC, Perdew JP (1980) Phys Rev B 21:5469

  16. Wang YA (1997) Phys Rev A 55:4589

  17. Wang YA (1997) Phys Rev A 56:1646

  18. Levy M (1979) Proc Natl Acad Sci USA 76:6062

  19. Nesbet RK (2001) Phys Rev A 65:010502

  20. Nesbet RK (2003) Adv Quantum Chem 43:1

  21. Dreizler RM, Gross EKU (1990) Density functional theory. Springer, Berlin

  22. Davidson ER (1976) Reduced density matrices in quantum chemistry. Academic, New York

  23. Perdew JP, Levy M (1985) Phys Rev B 31:6264

  24. Englisch H, Englisch R (1983) Physica A 121:253

  25. Zhang YA, Wang YA (2009) Int J Quantum Chem 109:3199

  26. Milne RD (1980) Applied functional analysis: an introductory treatment. Pitman Publishing, UK

Acknowledgements

Financial support for this project was provided by a grant from the Natural Sciences and Engineering Research Council (NSERC) of Canada.

Author information


Corresponding author

Correspondence to Yan Alexander Wang.


Appendices

Appendix 1

Here, we briefly introduce some mathematical concepts relevant to our discussion in the main text. All of the following content is adapted from an introductory book on functional analysis [26].

Definition 1

A vector space \(\mathcal {V}\) is a set of elements called vectors with two operations called addition and scalar multiplication, which satisfy the following axioms.

  • Addition axioms: To every pair of vectors \(x,y \in \mathcal {V}\), there corresponds a unique vector \(x+y \in \mathcal {V}\), the sum of x and y, such that

    1. \(x+y=y+x\);

    2. \((x+y)+z=x+(y+z)\);

    3. there exists a unique zero vector \(\theta \in \mathcal {V}\) such that \(x+\theta = \theta +x =x, \forall x\in \mathcal {V}\);

    4. for every vector x there exists a unique vector \((-x)\in \mathcal {V}\) such that \(x+(-x)=\theta \).

  • Scalar multiplication axioms: To every scalar \(\alpha \) and every vector \(x\in \mathcal {V}\) there corresponds a unique vector \(\alpha x \in \mathcal {V}\) such that

    1. \(\alpha (\beta x) = (\alpha \beta ) x \) for every scalar \(\beta \);

    2. \(1x=x, 0x=\theta , \forall x \in \mathcal {V}\);

    3. \(\alpha (x+y) = \alpha x + \alpha y\) and \((\alpha + \beta ) x = \alpha x + \beta x\).

Definition 2

If x and y are two points of a vector space, then the line segment joining them is the set of elements \(\{\beta x + (1-\beta )y\, |\, 0 \le \beta \le 1\}\). A subset S of a vector space is convex if the line segment joining any two points in S is contained in S.

Definition 3

Let \(\mathcal {U}\) and \(\mathcal {V}\) be two vector spaces with the same system of scalars. Then a function (or mapping) that uniquely maps each element of \(\mathcal {V}\) to an element of \(\mathcal {U}\),

$$\begin{aligned} T: \mathcal {V} \rightarrow \mathcal {U} \end{aligned}$$
(101)

is called a linear transformation of \(\mathcal {V}\) into \(\mathcal {U}\) if

  1. \(T(x+y) = Tx + Ty, \forall x,y \in \mathcal {V}\);

  2. \(T(\alpha x)=\alpha T x, \forall x \in \mathcal {V}\) and for all scalars \(\alpha \).

Definition 4

A metric (or distance function) on a set S is a real-valued function \(d(x,y)\) defined for all pairs of elements x and y in S and which satisfies the following axioms:

  1. \(d(x,y)\ge 0\); \(d(x,y)=0\) if and only if \(x=y\);

  2. \(d(x,y)=d(y,x), \forall x,y \in S\);

  3. \(d(x,z)\le d(x,y)+ d(y,z), \forall x,y,z \in S\).

A metric space, denoted by \((S,d)\), consists of a set S and a metric d on S.
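The three axioms of Definition 4 can be spot-checked numerically. The following is a minimal sketch, assuming the Euclidean distance on a small hand-picked sample of points; the names `euclidean` and `satisfies_metric_axioms` are illustrative and do not come from the source. A finite check of this kind can only falsify the axioms, never prove them.

```python
import itertools
import math


def euclidean(x, y):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def satisfies_metric_axioms(d, points, tol=1e-12):
    """Check non-negativity, identity, symmetry, and the triangle
    inequality for the distance function d on a finite sample of points."""
    for x, y in itertools.product(points, repeat=2):
        if d(x, y) < 0:
            return False
        # d(x, y) = 0 must hold exactly when x = y.
        if (d(x, y) <= tol) != (x == y):
            return False
        if abs(d(x, y) - d(y, x)) > tol:
            return False
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:
            return False
    return True


sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (-1.5, 0.5)]
is_metric_on_sample = satisfies_metric_axioms(euclidean, sample)


def squared_gap(x, y):
    """Squared distance on the real line; violates the triangle inequality."""
    return (x[0] - y[0]) ** 2


squared_is_metric = satisfies_metric_axioms(squared_gap, [(0.0,), (1.0,), (2.0,)])
```

The squared distance is included as a counterexample: it is non-negative and symmetric, but it fails the triangle inequality on the points 0, 1, 2.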

Definition 5

Let T be an operator (mapping, transformation) whose domain Dom(T) and range Ran(T) belong to metric spaces \((X,d_X)\) and \((Y,d_Y)\), respectively. The operator T is continuous at point \(x_0 \in Dom(T)\) if, for every \(\epsilon > 0\), there exists \(\delta >0\) such that

$$\begin{aligned} d_{Y}(Tx,Tx_0)< \epsilon \end{aligned}$$
(102)

whenever

$$\begin{aligned} d_{X}(x,x_0)< \delta \ . \end{aligned}$$
(103)

Definition 6

A sequence \(\{x^{(k)} \}\) in a metric space \((S,d)\) is said to be a Cauchy sequence if \(d(x^{(k)},x^{(l)})\rightarrow 0\) as \(k,l\rightarrow \infty \). This means that for every \(\delta >0\) there exists \(N_{\delta }\) such that \(d(x^{(k)},x^{(l)})\le \delta \) for any \(k,l\ge N_{\delta }\).

Definition 7

A metric space \((S,d)\) is said to be complete if every Cauchy sequence in \((S,d)\) has a limit in \((S,d)\).

Definition 8

A norm (or length function) on a vector space \(\mathcal {V}\) is a real-valued function, ||x||, defined for all vectors \(x\in \mathcal {V}\) and which satisfies the following axioms:

  1. \(||x||\ge 0\); \(||x||=0\) if and only if \(x=\theta \);

  2. \(||x + y ||\le ||x|| + ||y||\), \(\forall x, y \in \mathcal {V}\);

  3. \(||\lambda x|| =|\lambda | \cdot ||x||\), for an arbitrary scalar \(\lambda \).

A normed vector space, denoted by \((\mathcal {V}, ||\cdot ||)\), consists of a vector space \(\mathcal {V}\) and a norm \(||\cdot ||\) on \(\mathcal {V}\).

Definition 9

A complete (with respect to the norm) normed vector space is called a Banach space.

Definition 10

Let \(T: \mathcal {V}\rightarrow \mathcal {U}\) be a bounded linear transformation between normed vector spaces, that is, there exists a constant \(K\ge 0\) such that, for all \(x\in \mathcal {V}\),

$$\begin{aligned} ||Tx||\le K||x||. \end{aligned}$$
(104)

The smallest value of K that satisfies this inequality is denoted by ||T|| and called the norm of T. It can be verified that this norm for operators satisfies the axioms for a norm function and that we may therefore talk of the vector space of bounded linear transformations \(T: \mathcal {V} \rightarrow \mathcal {U}\). This normed vector space is denoted by \(\mathcal {L}(\mathcal {V},\mathcal {U})\).
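In finite dimensions with Euclidean norms, the operator norm of Definition 10 is the largest singular value of the matrix representing T. The sketch below assumes a hypothetical 3×2 matrix `T` chosen only for illustration; it checks the bound of Eq. (104) on random vectors and confirms that the bound is attained, so no smaller K would do.

```python
import numpy as np


def operator_norm(matrix):
    """Operator norm induced by the Euclidean norm (largest singular value)."""
    return np.linalg.norm(matrix, 2)


rng = np.random.default_rng(0)
T = np.array([[2.0, 1.0], [0.0, 3.0], [1.0, -1.0]])  # hypothetical map R^2 -> R^3
K = operator_norm(T)

# ||T x|| <= K ||x|| for every sampled x, as required by Eq. (104).
samples = rng.normal(size=(1000, 2))
ratios = np.linalg.norm(samples @ T.T, axis=1) / np.linalg.norm(samples, axis=1)
bound_holds = bool(np.all(ratios <= K + 1e-12))

# The bound is attained along the leading right singular vector (unit length).
_, singular_values, vt = np.linalg.svd(T)
attained = np.linalg.norm(T @ vt[0])
```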

Definition 11

Consider an operator \(T: \mathcal {V}\rightarrow \mathcal {U}\), where \(\mathcal {V}\) is a vector space and \(\mathcal {U}\) is a normed vector space. Let x belong to the domain of the operator T, \(Dom(T)\subset \mathcal {V}\), and let \(s\in \mathcal {V}\). If the limit

$$\begin{aligned} dT(x;s) =\underset{\lambda \rightarrow 0}{\lim } \frac{T(x+\lambda s)-T(x)}{\lambda } \end{aligned}$$
(105)

exists, it is called the Gâteaux differential of T at x in the direction s. The limit is to be understood in the sense of convergence with respect to the norm in \(\mathcal {U}\). The differential may exist for some s and fail to exist for others: if the differential exists at x for all s we say that T is Gâteaux differentiable at x.

The Gâteaux differential is homogeneous in s in the sense that

$$\begin{aligned} dT(x;\alpha s) =\alpha dT(x;s) \end{aligned}$$
(106)

but is in general neither linear nor continuous in s. Nor does the existence of the Gâteaux differential at x ensure continuity of T at x. For example, consider

$$\begin{aligned} f(\xi _1, \xi _2) = \left\{ \begin{array}{ll} \frac{\xi _{1}^3}{\xi _2} &{} (\xi _2 \ne 0)\\ 0 &{} (\xi _2 =0) \end{array} \right. \ . \end{aligned}$$
(107)

At the point (0, 0), the difference quotient in the direction \(s=(s_1,s_2)\) equals \(\lambda s_1^3/s_2\) for \(s_2\ne 0\) and 0 for \(s_2=0\), so the Gâteaux differential exists and is zero for every s. Being identically zero, it is trivially linear and continuous in s. However, f is not continuous at (0, 0): along the curve \(\xi _2=\xi _1^3\), f equals 1 arbitrarily close to the origin. Therefore, Gâteaux differentiability of T does not imply continuity of T.
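The behavior of the example in Eq. (107) can be checked numerically. The sketch below assumes that f is extended by 0 wherever \(\xi _2=0\), so that it is defined along every direction through the origin; the helper names are illustrative only. It evaluates the difference quotient of Eq. (105) at (0, 0) for several directions and then samples f along the curve \(\xi _2=\xi _1^3\).

```python
def f(xi1, xi2):
    """xi1**3 / xi2 where xi2 is nonzero, and 0 where xi2 = 0 (assumed extension)."""
    return xi1 ** 3 / xi2 if xi2 != 0 else 0.0


def difference_quotient(s1, s2, lam):
    """[f(0 + lam*s) - f(0)] / lam for the direction s = (s1, s2)."""
    return (f(lam * s1, lam * s2) - f(0.0, 0.0)) / lam


directions = [(1.0, 1.0), (1.0, -2.0), (3.0, 0.5), (1.0, 0.0), (0.0, 1.0)]
quotients = {
    s: [difference_quotient(*s, lam) for lam in (1e-2, 1e-4, 1e-6)]
    for s in directions
}

# Along the curve xi2 = xi1**3 the function stays at 1 however close to the origin.
values_on_curve = [f(t, t ** 3) for t in (1e-1, 1e-2, 1e-3)]
```

Every difference quotient shrinks in proportion to \(\lambda \), which is consistent with a vanishing Gâteaux differential, while the values along the curve do not approach f(0, 0) = 0.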

Let us go forward on the basis that \(\mathcal {V}\) is also a normed vector space. Suppose \(dT(x;s)\) is linear and continuous in s for some \(x\in \mathcal {V}\), then we may write

$$\begin{aligned} dT(x;s) = \underset{\lambda \rightarrow 0}{\lim } \frac{T(x+\lambda s)-T(x)}{\lambda } = T'_{G}(x)s \ . \end{aligned}$$
(108)

The operator \(T'_G(x)\) is, by definition, a mapping \(\mathcal {V}\rightarrow \mathcal {U}\) and is linear and continuous: we may conclude that

$$\begin{aligned} T'_{G}(x) \in \mathcal {L} (\mathcal {V},\mathcal {U}) \ . \end{aligned}$$
(109)

This operator is called the Gâteaux or weak derivative of T at x. It is very important to note that when speaking of the linearity and continuity of \(T'_{G}(x)\), we mean those properties in the operator sense, that is, with respect to the direction s at a fixed x. \(T'_{G}\) itself may be a function of x, but its continuity and linearity with respect to the variable x are completely different things from the continuity and linearity discussed here.

When \(T'_{G}(x)\) exists, it is certainly true that

$$\begin{aligned} T(x+\lambda s) - T(x) = T'_{G}(x)\lambda s + \epsilon (x,s,\lambda ) \ , \end{aligned}$$
(110)

where \(\epsilon /\lambda \rightarrow 0\) as \(\lambda \rightarrow 0\) with x and s fixed. However, the convergence may not be uniform with respect to s and in that case T cannot be approximated by a linear operator with uniform accuracy in the neighborhood of x. If we further demand uniform convergence then we arrive at the strong derivative.

Definition 12

Let \(\mathcal {V}\) and \(\mathcal {U}\) be normed vector spaces. An operator \(T: \mathcal {V}\rightarrow \mathcal {U}\) is Fréchet differentiable at \(x\in Dom(T)\subset \mathcal {V}\) if there exists a continuous linear operator \(T'_F(x) \in \mathcal {L}(\mathcal {V},\mathcal {U})\) such that, for all \(s\in \mathcal {V}\),

$$\begin{aligned} T(x+s) - T(x) = T'_F(x)s + \epsilon (x;s) \end{aligned}$$
(111)

with

$$\begin{aligned} \underset{||s||_{\mathcal {V}} \rightarrow 0}{\lim } \frac{||\epsilon (x;s)||_{\mathcal {U}}}{||s||_{\mathcal {V}}} =0 \ . \end{aligned}$$
(112)

The operator \(T'_F(x)\) is called the Fréchet or strong derivative of T at x. The Fréchet derivative at x is unique. It can be shown that the existence of the Fréchet derivative of T at x implies continuity of T at x.
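The uniformity demanded by Eq. (112) can be illustrated with a simple smooth map. The sketch below assumes the functional \(T(x)=x\cdot x\) on \(\mathbb {R}^3\), which is not taken from the source; its candidate Fréchet derivative is \(T'_F(x)s=2x\cdot s\), so the remainder is \(\epsilon (x;s)=||s||^2\) and the ratio in Eq. (112) equals \(||s||\) in every direction.

```python
import numpy as np


def T(x):
    """Illustrative functional T(x) = x . x (an assumption, not from the source)."""
    return float(x @ x)


def frechet_candidate(x, s):
    """Candidate Frechet derivative of T at x applied to s: 2 x . s."""
    return float(2.0 * x @ s)


def remainder_ratio(x, s):
    """|epsilon(x; s)| / ||s||, the quantity that must vanish in Eq. (112)."""
    eps = T(x + s) - T(x) - frechet_candidate(x, s)
    return abs(eps) / np.linalg.norm(s)


rng = np.random.default_rng(1)
x = np.array([1.0, -2.0, 0.5])
worst = {}
for radius in (1e-1, 1e-2, 1e-3):
    dirs = rng.normal(size=(200, 3))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)  # unit directions
    # Largest ratio over all sampled directions at this step length.
    worst[radius] = max(remainder_ratio(x, radius * d) for d in dirs)
```

The worst-case ratio over all sampled directions tracks the step length itself, which is the direction-independent decay that distinguishes the strong derivative from the weak one.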

Theorem 1

If the Gâteaux derivative \(T'_{G}(x)\) exists in a neighborhood of x and is continuous with respect to the norm in \(\mathcal {L}(\mathcal {V},\mathcal {U})\) at x, then the Fréchet derivative \(T'_F(x)\) exists and is equal to \(T'_{G}(x)\).

Appendix 2

In this appendix, we show that the Fréchet derivative does not exist in the normalized density domain, \(\mathcal {J}_N\).

Define a normalized path wavefunction,

$$\begin{aligned} \Psi _{p}=\sqrt{1-\beta ^2}\Psi _0 + \beta \Psi _{\mathcal D} \ , \end{aligned}$$
(113)

where \(\Psi _0\) is the GS wavefunction for an N-electron quantum system, \(\Psi _{\mathcal D}\) is a linear combination of eigenfunctions in \(\mathcal {D}\) of \(\Psi _0\), and \(0\le \beta \le 1\). Both \(\Psi _0\) and \(\Psi _{\mathcal D}\) are normalized to 1. The corresponding path density is

$$\begin{aligned} \rho _{p}(\mathbf {r})= & {} N\langle \Psi _{p}|\Psi _{p}\rangle _{N-1} \nonumber \\= & {} (1-\beta ^2)N\langle \Psi _{0}|\Psi _{0}\rangle _{N-1} + \beta ^{2}N\langle \Psi _{\mathcal D}|\Psi _{\mathcal D}\rangle _{N-1} \nonumber \\= & {} (1-\beta ^2)\rho _0(\mathbf {r}) + \beta ^2\rho _{\mathcal D}(\mathbf {r}) \ . \end{aligned}$$
(114)

When \(\beta \) approaches 0, \(\rho _{p}(\mathbf {r})\) approaches \(\rho _0(\mathbf {r})\). Letting \(\beta \) change continuously from 1 to 0, we obtain the desired density variational path. Equation (114) shows that the path density is automatically normalized to N, so the density variation stays within the normalized space. Clearly, \(\rho _{p}(\mathbf{r})\) lies in the neighborhood of \(\rho _{0}(\mathbf{r})\) within \(\mathcal {J}_N\). For convenience, we label \(\mathcal {B}_N\) as the set of all legitimate N-representable \(\rho _{p}(\mathbf{r})\) defined for a given \(\Psi _{0}\) or \(\rho _{0}(\mathbf{r})\) in Eq. (114).

A trial wavefunction is then assumed to yield the same path density:

$$\begin{aligned} \widetilde{\Psi } = \sqrt{1-\beta ^2}\Psi _0 + \lambda \Psi _t = \sqrt{1-\beta ^2}\Psi _0 + \lambda \sum _{i=0}^{\infty }c_i\Psi _i\longmapsto \rho _{p}(\mathbf {r}) \ , \end{aligned}$$
(115)

where \(\Psi _i\) is the ith normalized eigenfunction of \(\hat{H}\), \(\langle \Psi _t|\Psi _t\rangle =1\), and the expansion coefficients \(\{c_i\}\) are chosen to be real. The complete set of \(\{\Psi _i\}\) can be divided into three parts: \(\Psi _0\), \(\mathcal S\), and \(\mathcal D\). The electron density (the trial density) for \(\widetilde{\Psi }\) takes the following form:

$$\begin{aligned} \widetilde{\rho }(\mathbf{r})= & {} N\langle \widetilde{\Psi }|\widetilde{\Psi }\rangle _{N-1}\nonumber \\= & {} (1-\beta ^2)N\langle \Psi _{0}|\Psi _{0}\rangle _{N-1}+\lambda ^{2}N\langle \Psi _{t}|\Psi _{t}\rangle _{N-1} + 2\lambda \sqrt{1-\beta ^2} N Re\!\left( \langle \Psi _{0}|\Psi _{t}\rangle _{N-1}\right) \nonumber \\= & {} (1-\beta ^2)\rho _{0}(\mathbf{r})+\lambda ^{2}\rho _{t}(\mathbf{r}) + 2\lambda \sqrt{1-\beta ^2} N Re\!\left( \langle \Psi _{0}|\Psi _{t}\rangle _{N-1}\right) \ . \end{aligned}$$
(116)

At any point, the trial density is identical to the path density to ensure that the density variation is actually along the path we designed:

$$\begin{aligned} \widetilde{\rho }(\mathbf{r})=\rho _{p}(\mathbf{r})\rightarrow \rho _{0}(\mathbf{r})\ . \end{aligned}$$
(117)

Therefore, we have

$$\begin{aligned} \left\langle \widetilde{\rho }(\mathbf{r})\right\rangle =\left\langle \rho _{p}(\mathbf{r})\right\rangle \ . \end{aligned}$$
(118)

Substituting Eqs. (114) and (116) into Eq. (118) and simplifying the result, one derives

$$\begin{aligned} \beta ^{2}=\lambda ^2 + 2\lambda c_0 \sqrt{1-\beta ^2} \ . \end{aligned}$$
(119)
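The intermediate step behind Eq. (119) can be written out as follows, reading \(\langle \cdot \rangle \) in Eq. (118) as integration over all space and assuming that \(\rho _0\), \(\rho _t\), and \(\rho _{\mathcal D}\) each integrate to N and that the eigenfunctions \(\{\Psi _i\}\) are orthonormal, so that \(\langle \Psi _0|\Psi _t\rangle =c_0\):

$$\begin{aligned} \left\langle \widetilde{\rho }\right\rangle&= (1-\beta ^2)N + \lambda ^{2}N + 2\lambda \sqrt{1-\beta ^2}\, N c_0 \ , \nonumber \\ \left\langle \rho _{p}\right\rangle&= (1-\beta ^2)N + \beta ^{2}N = N \ . \end{aligned}$$

Equating the two expressions, cancelling the common term \((1-\beta ^2)N\), and dividing by N gives Eq. (119).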

At one specific point on the variational path, where the value of \(\beta \) is fixed, we can solve Eq. (119) for \(\lambda \) in terms of \(\beta \):

$$\begin{aligned} \lambda =-c_0 \sqrt{1-\beta ^2}\pm \sqrt{c_0^2 (1-\beta ^2) + \beta ^2} \ . \end{aligned}$$
(120)

Near the end of the variational path, when \(\beta \rightarrow 0\) and \(c_0 \ne 0\),

$$\begin{aligned} \lambda \rightarrow -c_0 \sqrt{1-\beta ^2} \pm \left[ c_0 \sqrt{1-\beta ^2} + \frac{1}{2 c_0}\beta ^2 + \cdots \right] \ . \end{aligned}$$
(121)

Again (see Appendix 3), the positive sign is chosen in Eq. (121), and we have

$$\begin{aligned} \lambda \rightarrow \frac{1}{2 c_0}\beta ^2 + \cdots ,\;\;\text {as}\;\beta \rightarrow 0 \ . \end{aligned}$$
(122)

Immediately, we can conclude that, towards the end of the variational path, \(\lambda \) is of the same order of magnitude as \(\beta ^2/c_0\). In other words, \(\lambda \) approaches zero at the same rate as \(\beta ^2/c_0\).
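The two roots in Eq. (120) and the small-\(\beta \) behavior in Eq. (122) can be verified numerically. The sketch below assumes a fixed, hypothetical value \(c_0=0.6\) (in the text, \(c_0\) is determined by the optimization and may itself depend on \(\beta \)); the function names are illustrative.

```python
import math


def lambda_roots(beta, c0):
    """Both roots of Eq. (119), returned as (plus root, minus root) of Eq. (120)."""
    a = math.sqrt(1.0 - beta ** 2)
    disc = math.sqrt(c0 ** 2 * (1.0 - beta ** 2) + beta ** 2)
    return -c0 * a + disc, -c0 * a - disc


def residual(lam, beta, c0):
    """Right-hand side minus left-hand side of Eq. (119); zero for a true root."""
    return lam ** 2 + 2.0 * lam * c0 * math.sqrt(1.0 - beta ** 2) - beta ** 2


c0 = 0.6  # hypothetical fixed value, for illustration only
rows = []
for beta in (1e-1, 1e-2, 1e-3):
    lam_plus, lam_minus = lambda_roots(beta, c0)
    leading = beta ** 2 / (2.0 * c0)  # leading term of Eq. (122)
    rows.append((beta, lam_plus, lam_minus, leading,
                 residual(lam_plus, beta, c0), residual(lam_minus, beta, c0)))
```

For these inputs the plus root approaches \(\beta ^2/(2c_0)\) with a relative deviation of order \(\beta ^2\), while the minus root stays close to \(-2c_0\), the case treated in Appendix 3.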

Because of Eqs. (114), (116), and (117), we obtain

$$\begin{aligned} \lambda ^{2}\rho _{t}+ 2N\lambda \sqrt{1-\beta ^2} Re\!\left( \left\langle \left. \Psi _{0}\right| \!\Psi _{t}\right\rangle _{N-1}\right) =\beta ^{2}\rho _{\mathcal {D}}\ . \end{aligned}$$
(123)

Substituting Eq. (119) into Eq. (123) yields

$$\begin{aligned} 2\sqrt{1-\beta ^2}\left[ c_{0} \rho _{\mathcal {D}}-N\sum _{i}^{\mathcal {S}_{0}}c_{i} Re\!\left( \!\left\langle \left. \Psi _{0}\right| \!\Psi _{i}\right\rangle _{N-1}\right) \right] = \lambda (\rho _{t}-\rho _{\mathcal {D}})\;, \end{aligned}$$
(124)

where the summation on the LHS is only within \(\mathcal {S}_0\). At \(\beta \rightarrow 0\), we find that the coefficients \(\{c_i\}\) for \(\Psi _i \in {\mathcal S}_0\) are linear in \(\lambda \).

Having established the behavior of \(\{c_i\}\) for wavefunctions in \(\mathcal {S}_0\), we now investigate the remaining \(\{c_i\}\) for wavefunctions in \(\mathcal {D}\). At one particular point on the variational path (\(\beta \) fixed), we optimize the trial wavefunction to find the set of coefficients \(\{c_i\}\) that yields the lowest energy for

$$\begin{aligned} \langle \widetilde{\Psi }|\hat{H}|\widetilde{\Psi }\rangle= & {} \left\langle \! \sqrt{1-\beta ^2}\Psi _0+\lambda \Psi _{t}\left| \hat{H}\right| \sqrt{1-\beta ^2}\Psi _0+\lambda \Psi _{t}\!\right\rangle \nonumber \\= & {} E_0 -\left[ \beta ^2 - 2\lambda c_0 \sqrt{1-\beta ^2}\right] E_0 + \lambda ^2\langle \Psi _t|\hat{H}|\Psi _t\rangle \nonumber \\= & {} E_0 - \lambda ^2 E_0 + \lambda ^2\langle \Psi _t|\hat{H}|\Psi _t\rangle = E_0 + \lambda ^2\left( \langle \Psi _t|\hat{H}|\Psi _t\rangle - E_0 \right) \ , \end{aligned}$$
(125)

where Eq. (119) has been used to simplify the expression after the second equal sign. Obviously, we only need to minimize the last term in Eq. (125) under the following two constraints:

$$\begin{aligned} \sum _{i=0}^{\infty } c_{i}^{2}=1\;, \end{aligned}$$
(126)

and

$$\begin{aligned} \widetilde{\rho }(\mathbf{r})=\rho _{p}(\mathbf r)\ . \end{aligned}$$
(127)
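The algebra leading to the last line of Eq. (125) can be checked in a finite-dimensional toy model. The sketch below assumes a hypothetical four-level system represented in its own eigenbasis, with made-up energies and coefficients; it enforces only the normalization condition of Eq. (119), not the pointwise density constraint of Eq. (127).

```python
import numpy as np

# Hypothetical eigenvalues E_0 < E_1 < E_2 < E_3; H is diagonal in its eigenbasis.
energies = np.array([-1.0, -0.4, 0.3, 0.9])
H = np.diag(energies)
E0 = energies[0]

# Made-up real expansion coefficients of Psi_t, normalized as in Eq. (126).
c = np.array([0.5, 0.3, -0.6, 0.2])
c /= np.linalg.norm(c)
psi0 = np.eye(4)[0]
psi_t = c

beta = 0.2
a = np.sqrt(1.0 - beta ** 2)
# Positive root of Eq. (120), so that Eq. (119) holds.
lam = -c[0] * a + np.sqrt(c[0] ** 2 * a ** 2 + beta ** 2)

psi_trial = a * psi0 + lam * psi_t  # trial wavefunction of Eq. (115)
norm_sq = psi_trial @ psi_trial
lhs = psi_trial @ H @ psi_trial
rhs = E0 + lam ** 2 * (psi_t @ H @ psi_t - E0)  # last line of Eq. (125)
```

In this toy model the trial vector is normalized and the two energy expressions agree to rounding error, which mirrors the use of Eq. (119) in simplifying Eq. (125).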

The density constraint, Eq. (127), is equivalent to the following identity based on our previous analysis:

$$\begin{aligned}&2\lambda \sqrt{1-\beta ^2}\left[ \frac{c_{0}(\rho _{0}-\rho _{\mathcal {D}})}{N}+\sum _{i}^{\mathcal {S}}c_{i} Re\!\left( \!\left\langle \left. \Psi _{0}\right| \!\Psi _{i}\right\rangle _{N-1}\right) \right] \nonumber \\&\ \ \ = \lambda ^{2}\left( \frac{\rho _{\mathcal {D}}}{N}-\sum _{i,j}^{\infty }c_{j} c_{i}\left\langle \left. \Psi _{j}\right| \!\Psi _{i}\right\rangle _{N-1}\!\right) \ . \end{aligned}$$
(128)

We will use the Euler-Lagrange multiplier method to find the set of coefficients \(\{c_{i}\}\) that minimizes the value of \(\lambda ^{2}\left( \langle \Psi _{t}|\hat{H}|\Psi _{t}\rangle -E_{0}\right) \). Let

$$\begin{aligned} \mathbf{A}=\sum _{i=0}^{\infty } c_{i}^{2}\ -1\;, \end{aligned}$$
(129)
$$\begin{aligned} \mathbf{B}= & {} 2\lambda \sqrt{1-\beta ^2}\left[ \frac{c_{0}(\rho _{0}-\rho _{\mathcal {D}})}{N}+\sum _{i}^{\mathcal {S}} c_{i} Re\!\left( \!\left\langle \left. \Psi _{0}\right| \!\Psi _{i}\right\rangle _{N-1}\right) \!\right] \nonumber \\&-\lambda ^{2}\left( \frac{\rho _{\mathcal {D}}}{N}-\sum _{i,j}^\infty c_{i} c_{j}\left\langle \left. \Psi _{i}\right| \!\Psi _{j}\right\rangle _{N-1}\!\right) \;,\end{aligned}$$
(130)

and

$$\begin{aligned} \mathbf{\mathbf{\Omega }}= & {} \lambda ^{2}\left( \langle \Psi _{t}|\hat{H}|\Psi _{t}\rangle -E_{0}\right) -h\mathbf{A}-\langle g(\mathbf{r})\,\mathbf{B}\rangle \nonumber \\= & {} \lambda ^{2}\left( \sum _{i,j}^{\infty }c_{i} c_{j}\langle \Psi _{i}|\hat{H}|\Psi _{j}\rangle -E_{0}\right) -h\mathbf{A}-\langle g(\mathbf{r})\,\mathbf{B}\rangle \nonumber \\= & {} \lambda ^{2}\left( \sum _{i,j}^{\infty }c_{i} c_{j}E_{j}\delta _{ij}-E_{0}\right) -h\mathbf{A}-\langle g(\mathbf{r})\,\mathbf{B}\rangle \nonumber \\= & {} \lambda ^{2}\left[ \sum _{i=1}^{\infty } c_{i}^{2}(E_{i}-E_{0})\right] -h\mathbf{A}-\langle g(\mathbf{r})\,\mathbf{B}\rangle \ , \end{aligned}$$
(131)

where h and \(g(\mathbf{r})\) are the Lagrange multipliers corresponding to the two constraints in Eqs. (129) and (130). Minimizing Eq. (131) with respect to \(\{c_{i}\}\), one obtains

$$\begin{aligned} \lambda \left\langle \!\left[ \frac{\sqrt{1-\beta ^2}(\rho _{0}-\rho _{\mathcal {D}})}{N}+\lambda \sum _{j=0}^{\infty }c_{j}Re\! \left( \!\left\langle \left. \Psi _{0}\right| \!\Psi _{j}\right\rangle _{N-1}\right) \!\right] g(\mathbf{r})\!\!\right\rangle =-h c_{0}\ , \end{aligned}$$
(132)

and

$$\begin{aligned}&\lambda \left\langle \! \left[ \sqrt{1-\beta ^2}Re\! \left( \!\left\langle \left. \Psi _{i}\right| \Psi _{0}\right\rangle _{N-1}\right) +\lambda \sum _{j=0}^{\infty }c_{j}Re\! \left( \!\left\langle \left. \Psi _{i}\right| \!\Psi _{j}\right\rangle _{N-1}\right) \!\right] g(\mathbf{r})\!\!\right\rangle \nonumber \\&\;\;\;\;=\left[ \lambda ^2(E_i - E_0) - h \right] c_{i}\;\;\;\;\;\;\;\;\; (\text {for}\;i\ne 0)\ . \end{aligned}$$
(133)

Because \(c_0\) is linear in \(\lambda \) as we previously showed, we can readily infer from Eq. (132) that \(g(\mathbf{r})\) must take the following form:

$$\begin{aligned} g_{\scriptstyle {\lambda }}(\mathbf{r})=g^{(0)}(\mathbf{r})+\sum _{k=1}^{\infty }\frac{g^{(k)}(\mathbf{r})}{k!}\lambda ^{k}\ . \end{aligned}$$
(134)

Substituting Eq. (134) into Eq. (133) and ignoring the higher-order terms as \(\lambda \rightarrow 0\), we obtain an equation for \(\Psi _i\in {\mathcal S}\),

$$\begin{aligned} - hc_{i}=\lambda \sqrt{1-\beta ^2}\left\langle Re\! \left( \!\left\langle \left. \Psi _{i}\right| \Psi _{0}\right\rangle _{N-1}\right) g^{(0)}(\mathbf{r})\!\right\rangle + h.o. \ , \end{aligned}$$
(135)

where “h.o.” denotes higher-order terms in \(\lambda \). Therefore, we reach the same conclusion as before: \(\{c_i\}\) for \(\Psi _i \in \mathcal S\) are linear in \(\lambda \) towards the end of the variational path. For those \(\{c_i\}\) for \(\Psi _i \in \mathcal D\), utilizing the additional fact that \(\Psi _i\) is order-1 strongly orthogonal to \(\Psi _0\), we can further simplify Eq. (133) to

$$\begin{aligned}&\lambda ^2\left\langle \! \sum _{j}^{\mathcal {S}_{0}}c_{j}Re\! \left( \!\left\langle \left. \Psi _{i}\right| \!\Psi _{j}\right\rangle _{N-1}\right) \!g(\mathbf{r})\!\!\right\rangle + \lambda ^2\left\langle \! \sum _{j}^{\mathcal D}c_{j}Re\! \left( \!\left\langle \left. \Psi _{i}\right| \!\Psi _{j}\right\rangle _{N-1}\right) \!g(\mathbf{r})\!\!\right\rangle \nonumber \\&\;\;\;\;=\lambda ^2(E_i - E_0) c_i - h c_{i} \ . \end{aligned}$$
(136)

For this equation to be valid at \(\lambda \rightarrow 0\), the LHS and the RHS must have the same dependence on \(\lambda \). On the RHS, the first term decays faster than the second term, and the second term will dominate when \(\lambda \) approaches 0. Therefore, we must match the magnitude of the second term on the RHS to the LHS. Of course, we cannot match it with the second term on the LHS because doing so would lead to self-inconsistency. Then, the second term on the RHS must decay in the same way as the first term on the LHS. Thus, \(\{c_i\}\) for \(\Psi _i \in \mathcal D\) are proportional to \(\lambda ^3\). Unfortunately, such a \(\lambda ^{3}\)-behavior contradicts the normalization constraint in Eq. (126), because \(\sum _i c_{i}^{2}\) will become 0 as \(\lambda \rightarrow 0\). Hence, we conclude that this contradiction must come from the assumption \({\Psi _{t}=\sum _i c_{i}\Psi _{i}}\), where the expansion is over the complete set of eigenfunctions of \(\hat{H}\).

To resolve the contradiction, we have to modify our assumption about the expansion of \(\Psi _{t}\). We notice that if the summation \(\sum _{i}c_{i}\Psi _{i}\) includes any wavefunction from \(\mathcal {S}_{0}\), the same problem will persist. Therefore, \(\Psi _{t}\) can only be expanded in \(\mathcal {D}\),

$$\begin{aligned} \Psi _{t}=\sum _{i}^{\mathcal {D}}c_{i}\Psi _{i}\ . \end{aligned}$$
(137)

In this case, Eq. (127) is equivalent to

$$\begin{aligned} \lambda ^{2}\rho _{t}(\mathbf{r})=\beta ^{2}\rho _{\mathcal {D}}(\mathbf{r})\;. \end{aligned}$$
(138)

Integrating both sides of Eq. (138) over the entire space, one obtains

$$\begin{aligned} \lambda ^{2}=\beta ^{2}\;, \end{aligned}$$
(139)

which further ensures that

$$\begin{aligned} \rho _{t}(\mathbf{r})=\rho _{\mathcal {D}}(\mathbf{r})\;. \end{aligned}$$
(140)
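Written out, the integration step behind Eq. (139) reads as follows, assuming that \(\rho _t\) and \(\rho _{\mathcal D}\) both integrate to N because \(\Psi _t\) and \(\Psi _{\mathcal D}\) are normalized to 1:

$$\begin{aligned} \lambda ^{2}\int \rho _{t}(\mathbf{r})\,d\mathbf{r} = \lambda ^{2}N \ , \qquad \beta ^{2}\int \rho _{\mathcal {D}}(\mathbf{r})\,d\mathbf{r} = \beta ^{2}N \quad \Longrightarrow \quad \lambda ^{2}=\beta ^{2} \ . \end{aligned}$$

Dividing Eq. (138) by this common nonzero factor then gives Eq. (140).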

Now, the original minimization process is reduced to minimizing the following term,

$$\begin{aligned} {\varvec{\Xi }} = \left( \sum _{i}^{\mathcal {D}}|c_{i}|^{2}E_{i}\right) -h\left( \sum _{i}^{\mathcal {D}}|c_{i}|^{2}-1\right) -\left\langle \!\!\left( \frac{\rho _{\mathcal {D}}}{N}-\sum _{i,j}^{\mathcal {D}}c_i^* c_j\langle \Psi _i|\Psi _j\rangle _{N-1}\right) g(\mathbf{r})\!\!\right\rangle .\nonumber \\ \end{aligned}$$
(141)

Suppose this minimization yields the optimal set of expansion coefficients, \(\{\bar{c}_i\}\), which, judging from the form of Eq. (141), do not depend on \(\lambda \) or \(\beta \). Then, we have

$$\begin{aligned}&\underset{\scriptstyle {\Psi _{0}+\delta \Psi \rightarrow \rho _{0}+\delta \rho }}{\inf }\frac{\langle \delta \Psi |\hat{H}-E_{0}|\delta \Psi \rangle }{||\delta \rho ||} = \underset{\scriptstyle {\Psi _{0}+\delta \Psi \rightarrow \rho _{0}+\delta \rho }}{\inf }\frac{\langle \widetilde{\Psi }-\Psi _0|\hat{H}-E_{0}|\widetilde{\Psi }- \Psi _0\rangle }{||\rho _{p}-\rho _0||}\nonumber \\&\;\;\;\;\;\;\;= \underset{\scriptstyle {\Psi _{t}\rightarrow \rho _{\mathcal {D}}}}{\inf }\frac{\left\langle (\sqrt{1-\beta ^2}-1)\Psi _0 + \lambda \Psi _{t}\left| \hat{H}-E_{0}\right| (\sqrt{1-\beta ^2}-1)\Psi _0 + \lambda \Psi _{t}\right\rangle }{||\beta ^{2}(\rho _{\mathcal {D}}-\rho _0)||}\nonumber \\&\;\;\;\;\;\;\;= \underset{\scriptstyle {\Psi _{t}\rightarrow \rho _{\mathcal {D}}}}{\inf }\frac{\lambda ^{2}\langle \Psi _{t}|\hat{H}-E_{0}|\Psi _{t}\rangle }{\beta ^{2}||\rho _{\mathcal {D}}-\rho _0||} = \frac{\underset{\scriptstyle {\Psi _{t}\rightarrow \rho _{\mathcal {D}}}}{\inf }\langle \Psi _{t}|\hat{H}-E_{0}|\Psi _{t}\rangle }{||\rho _{\mathcal {D}}-\rho _0||}\nonumber \\&\;\;\;\;\;\;\; =\frac{1}{||\rho _{\mathcal {D}}-\rho _0||}\displaystyle \left\langle \!\!\left. \sum _{i}^{\mathcal {D}}\bar{c}_i\Psi _{i}\right| \hat{H}-E_{0}\left| \sum _{j}^{\mathcal {D}}\bar{c}_j\Psi _{j}\right. \!\!\right\rangle = \frac{1}{||\rho _{\mathcal {D}}-\rho _0||} \left[ \sum _{i}^{\mathcal {D}}|\bar{c}_i|^2 E_i - E_0\right] \nonumber \\&\;\;\;\;\;\;\;> \frac{1}{||\rho _{\mathcal {D}}-\rho _0||} \left[ \sum _{i}^{\mathcal {D}}|\bar{c}_i|^2 E_0 - E_0\right] = 0 \ , \end{aligned}$$
(142)

where Eq. (139) is used to simplify the expression after the third equal sign. Evidently, Eq. (142) suggests that the condition for Fréchet differentiability proposed by Lindgren and Salomonson [8,9,10] is not fulfilled. In other words, the Fréchet derivative does not exist in the normalized density domain \(\mathcal {J}_N\) either.
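The scaling argument behind Eq. (142) can be illustrated with a one-particle toy model. The sketch below is built on several assumptions that are not in the source: a hypothetical 4-site Hamiltonian made of two decoupled 2×2 blocks, N = 1, the density taken as the squared amplitude per site, and the L1 norm for densities. The ground state lives on the first block and \(\Psi _{\mathcal D}\) on the second, so their pointwise product vanishes, which mimics order-1 strong orthogonality. The ratio is evaluated for the path wavefunction of Eq. (113) itself; no constrained search (infimum) is performed.

```python
import numpy as np


def path_ratio(beta, psi0, psi_d, H, E0):
    """<dPsi|H - E0|dPsi> / ||rho_p - rho_0||_1 along the path of Eq. (113)."""
    a = np.sqrt(1.0 - beta ** 2)
    psi_p = a * psi0 + beta * psi_d
    d_psi = psi_p - psi0
    numerator = d_psi @ (H - E0 * np.eye(len(psi0))) @ d_psi
    denominator = np.abs(psi_p ** 2 - psi0 ** 2).sum()
    return numerator / denominator


# Two decoupled blocks (hypothetical numbers); block_a holds the ground state.
block_a = np.array([[-2.0, -0.5], [-0.5, -1.0]])
block_b = np.array([[0.5, -0.3], [-0.3, 1.0]])
H = np.zeros((4, 4))
H[:2, :2] = block_a
H[2:, 2:] = block_b

ea, va = np.linalg.eigh(block_a)  # eigenvalues in ascending order
eb, vb = np.linalg.eigh(block_b)
psi0 = np.concatenate([va[:, 0], np.zeros(2)])   # ground state, sites 0-1
psi_d = np.concatenate([np.zeros(2), vb[:, 0]])  # disjoint support, sites 2-3
E0 = ea[0]

ratios = [path_ratio(b, psi0, psi_d, H, E0) for b in (1e-1, 1e-2, 1e-3)]
expected = (eb[0] - ea[0]) / 2.0  # beta**2 cancels; ||rho_D - rho_0||_1 = 2
```

In this toy model both the numerator and the denominator scale as \(\beta ^2\), so the ratio stays at a fixed positive value as \(\beta \rightarrow 0\) instead of vanishing, which is the qualitative behavior that Eq. (142) expresses.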

Appendix 3

In this appendix, we analyze the consequence of choosing the negative sign in Eqs. (41) and (121). In the end, we will conclude that this particular choice is fully equivalent to the more natural decision made in the main text and Appendix 2.

Let us start from a unified version of Eqs. (40) and (120):

$$\begin{aligned} \lambda =-a c_0\pm \sqrt{a^2 c_0^2 + \beta ^2} \ , \end{aligned}$$
(143)

where the constant a equals 1 in the main text and \(\sqrt{1-\beta ^2}\) in Appendix 2. Obviously, if \(c_0 = 0\) or \(c_0 \rightarrow 0\) as \(\beta \rightarrow 0\), both \(\lambda \) and \(\beta \) approach 0 concurrently near the end of the variational path.

We only need to further examine the situation when \(c_0 \ne 0\) as \(\beta \rightarrow 0\) with the choice of the negative sign in Eq. (143):

$$\begin{aligned} \lambda = -2a c_0 - \lambda ' \ , \end{aligned}$$
(144)

where the residual term \(\lambda '\) approaches 0 as \(\beta \rightarrow 0\):

$$\begin{aligned} \lambda ' = \left[ \frac{1}{2ac_0}\beta ^2 - \frac{1}{8 a^3 c_0^3}\beta ^4 + \cdots \right] \rightarrow 0 \ . \end{aligned}$$
(145)
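The series in Eq. (145) follows from a binomial expansion of the square root in Eq. (143). As a sketch, assuming \(ac_0>0\) and \(\beta ^2 < a^2c_0^2\):

$$\begin{aligned} \sqrt{a^2 c_0^2 + \beta ^2}&= a c_0\sqrt{1+\frac{\beta ^2}{a^2 c_0^2}} = a c_0\left[ 1 + \frac{\beta ^2}{2a^2 c_0^2} - \frac{\beta ^4}{8a^4 c_0^4} + \cdots \right] \nonumber \\&= a c_0 + \frac{1}{2ac_0}\beta ^2 - \frac{1}{8 a^3 c_0^3}\beta ^4 + \cdots \ . \end{aligned}$$

Inserting this expansion into Eq. (143) with the negative sign gives \(\lambda = -2ac_0 - \lambda '\), with \(\lambda '\) as in Eq. (145).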

Consequently, Eqs. (46) and (124) can be rewritten as

$$\begin{aligned} 2a\!\left[ c_{0}(\rho _{0}-\rho _{t})+N\sum _{i}^{\mathcal {S}} c_{i} Re\!\left( \!\left\langle \left. \Psi _{0}\right| \!\Psi _{i}\right\rangle _{N-1}\right) \right] =\lambda '\left( \rho _{t}-\rho _{\mathcal {D}}\right) \ , \end{aligned}$$
(146)

which immediately suggests that as \(\beta \rightarrow 0\), \((\rho _{0}-\rho _{t})\) and the coefficients \(\{c_i\}\) for \(\Psi _i \in {\mathcal S}\) are linear in \(\lambda '\). Because \(\rho _{t} \rightarrow \rho _{0}\), \(\Psi _t \rightarrow c_0 \Psi _0\) with \(|c_0| \rightarrow 1\), as \(\beta \rightarrow 0\). Therefore, at the end of the variational path (\(\beta =0\) and \(\lambda ' = 0\)), \(\lambda = -2ac_0\), \(\Psi _t = c_0 \Psi _0\), \(|c_0| = 1\), and \(\widetilde{\Psi } = -a \Psi _0\).

Evidently, the choice of the negative sign in Eqs. (41) and (121) yields a fully equivalent, alternative trial wavefunction,

$$\begin{aligned} \widetilde{\Psi }' = -a \Psi _0 - \lambda '\, \Psi _t \ , \end{aligned}$$
(147)

where \(\lambda ' \rightarrow 0\) as \(\beta \rightarrow 0\). Then, we can carry out the discussion on the basis of \(\lambda ' \rightarrow 0\) instead.


Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper


Cite this paper

Xiang, P., Wang, Y.A. (2018). Functional Derivatives and Differentiability in Density-Functional Theory. In: Wang, Y., Thachuk, M., Krems, R., Maruani, J. (eds) Concepts, Methods and Applications of Quantum Systems in Chemistry and Physics. Progress in Theoretical Chemistry and Physics, vol 31. Springer, Cham. https://doi.org/10.1007/978-3-319-74582-4_18
