Functional Derivatives and Differentiability in Density-Functional Theory

Xiang, Ping; Wang, Yan Alexander

doi:10.1007/978-3-319-74582-4_18


Part of the book series: Progress in Theoretical Chemistry and Physics (PTCP, volume 31)

Abstract

Based on Lindgren and Salomonson’s analysis of Fréchet differentiability [Phys Rev A 67:056501 (2003)], we showed a specific variational path along which the Fréchet derivative of the Levy-Lieb functional does not exist in the unnormalized density domain. This conclusion still holds even when the density is restricted to a normalized space. Furthermore, we extended our analysis to the Lieb functional and demonstrated that the Lieb functional is not Fréchet differentiable. Along our proposed variational path, the Gâteaux derivative of the Levy-Lieb functional or the Lieb functional takes a different form from the corresponding one along other, more conventional variational paths. This fact prompted us to define a new class of unconventional density variations and inspired us to present a modified density variation domain to eliminate the problems associated with such unconventional density variations.

References

1. Parr RG, Yang W (1989) Density-functional theory of atoms and molecules. Oxford University Press, New York
2. Hohenberg P, Kohn W (1964) Phys Rev 136:B864
3. Kohn W, Sham LJ (1965) Phys Rev 140:A1133
4. Wang YA, Xiang P (2013) In: Wesolowski TA, Wang YA (eds) Recent advances in orbital-free density functional theory, Chap. 1. World Scientific, Singapore, pp 3–12
5. Lieb EH (1983) Int J Quantum Chem 24:243
6. Englisch H, Englisch R (1983) Phys Stat Sol 123:711
7. Englisch H, Englisch R (1984) Phys Stat Sol 124:373
8. Lindgren I, Salomonson S (2003) Phys Rev A 67:056501
9. Lindgren I, Salomonson S (2003) Adv Quantum Chem 43:95
10. Lindgren I, Salomonson S (2004) Phys Rev A 70:032509
11. Ekeland I, Temam R (1976) Convex analysis and variational problems. North-Holland, Amsterdam
12. Harris J, Jones RO (1974) J Phys F 4:1170
13. Harris J (1984) Phys Rev A 29:1648
14. Gunnarsson O, Lundqvist BI (1976) Phys Rev B 13:4274
15. Langreth DC, Perdew JP (1980) Phys Rev B 21:5469
16. Wang YA (1997) Phys Rev A 55:4589
17. Wang YA (1997) Phys Rev A 56:1646
18. Levy M (1979) Proc Natl Acad Sci USA 76:6062
19. Nesbet RK (2001) Phys Rev A 65:010502
20. Nesbet RK (2003) Adv Quantum Chem 43:1
21. Dreizler RM, Gross EKU (1990) Density functional theory. Springer, Berlin
22. Davidson ER (1976) Reduced density matrices in quantum chemistry. Academic, New York
23. Perdew JP, Levy M (1985) Phys Rev B 31:6264
24. Englisch H, Englisch R (1983) Physica A 121:253
25. Zhang YA, Wang YA (2009) Int J Quantum Chem 109:3199
26. Milne RD (1980) Applied functional analysis: an introductory treatment. Pitman Publishing, UK

Acknowledgements

Financial support for this project was provided by a grant from the Natural Sciences and Engineering Research Council (NSERC) of Canada.

Author information

Authors and Affiliations

Department of Chemistry, University of British Columbia, 2036 Main Mall, Vancouver, BC, V6T 1Z1, Canada
Ping Xiang & Yan Alexander Wang

Corresponding author

Correspondence to Yan Alexander Wang.

Editor information

Editors and Affiliations

University of British Columbia, Vancouver, Canada
Yan A. Wang
University of British Columbia, Vancouver, Canada
Mark Thachuk
University of British Columbia, Vancouver, Canada
Roman Krems
Laboratoire de Chimie Physique, CNRS & UPMC, Paris, France
Jean Maruani

Appendices

Appendix 1

Here, we briefly introduce some mathematical concepts relevant to our discussion in the main text. All of the following content is adapted from an introductory book on functional analysis [26].

Definition 1

A vector space $\mathcal {V}$ is a set of elements called vectors with two operations called addition and scalar multiplication, which satisfy the following axioms.

Addition axioms: To every pair of vectors $x,y \in \mathcal {V}$, there corresponds a unique vector $x+y \in \mathcal {V}$, the sum of x and y, such that

1. $x+y=y+x$;
2. $(x+y)+z=x+(y+z)$;
3. there exists a unique zero vector $\theta \in \mathcal {V}$ such that $x+\theta = \theta +x =x, \forall x\in \mathcal {V}$;
4. for every vector x there exists a unique vector $(-x)\in \mathcal {V}$ such that $x+(-x)=\theta $.

Scalar multiplication axioms: To every scalar $\alpha $ and every vector $x\in \mathcal {V}$ there corresponds a unique vector $\alpha x \in \mathcal {V}$ such that

1. $\alpha (\beta x) = (\alpha \beta ) x $ for every scalar $\beta $;
2. $1x=x, 0x=\theta , \forall x \in \mathcal {V}$;
3. $\alpha (x+y) = \alpha x + \alpha y$ and $(\alpha + \beta ) x = \alpha x + \beta x$.

Definition 2

If x and y are two points of a vector space, then the line segment joining them is the set of elements $\{\beta x + (1-\beta )y\, |\, 0 \le \beta \le 1\}$. A subset S of a vector space is convex if the line segment joining any two points in S is contained in S.

Definition 3

Let $\mathcal {U}$ and $\mathcal {V}$ be two vector spaces with the same system of scalars. Then a function (or mapping) that maps uniquely the elements of $\mathcal {V}$ onto elements of $\mathcal {U}$,

$$\begin{aligned} T: \mathcal {V} \rightarrow \mathcal {U} \end{aligned}$$

(101)

is called a linear transformation of $\mathcal {V}$ into $\mathcal {U}$ if

1. $T(x+y) = Tx + Ty, \forall x,y \in \mathcal {V}$;
2. $T(\alpha x)=\alpha T x, \forall x \in \mathcal {V}$ and for all scalars $\alpha $.

Definition 4

A metric (or distance function) on a set S is a real-valued function d(x, y) defined for all pairs of elements x and y in S and which satisfies the following axioms:

1. $d(x,y)\ge 0$; $d(x,y)=0$ if and only if $x=y$;
2. $d(x,y)=d(y,x), \forall x,y \in S$;
3. $d(x,z)\le d(x,y)+ d(y,z), \forall x,y,z \in S$.

A metric space denoted by (S, d) consists of a set S and a metric d on S.

Definition 5

Let T be an operator (mapping, transformation) whose domain Dom(T) and range Ran(T) belong to metric spaces $(X,d_X)$ and $(Y,d_Y)$, respectively. The operator T is continuous at point $x_0 \in Dom(T)$ if, for every $\epsilon > 0$, there exists $\delta >0$ such that

$$\begin{aligned} d_{Y}(Tx,Tx_0)< \epsilon \end{aligned}$$

(102)

whenever

$$\begin{aligned} d_{X}(x,x_0)< \delta \ . \end{aligned}$$

(103)

Definition 6

A sequence $\{x^{(k)} \}$ in a metric space (S, d) is said to be a Cauchy sequence if $d(x^{(k)},x^{(l)})\rightarrow 0$ as $k,l\rightarrow \infty $. This means that for every $\delta >0$ there exists $N_{\delta }$ such that $d(x^{(k)},x^{(l)})\le \delta $ for any $k,l\ge N_{\delta }$.

Definition 7

A metric space (S, d) is said to be complete if every Cauchy sequence in (S, d) has a limit in (S, d).

Definition 8

A norm (or length function) on a vector space $\mathcal {V}$ is a real-valued function, ||x||, defined for all vectors $x\in \mathcal {V}$ and which satisfies the following axioms:

1. $||x||\ge 0$; $||x||=0$ if and only if $x=\theta $;
2. $||x + y ||\le ||x|| + ||y||$, $\forall x, y \in \mathcal {V}$;
3. $||\lambda x|| =|\lambda | \cdot ||x||$, for an arbitrary scalar $\lambda $.

A normed vector space, denoted by $(\mathcal {V}, ||\cdot ||)$ consists of a vector space $\mathcal {V}$ and a norm $||\cdot ||$ on $\mathcal {V}$.

Definition 9

A complete (with respect to the norm) normed vector space is called a Banach space.

Definition 10

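Let $T: \mathcal {V}\rightarrow \mathcal {U}$ be a bounded linear transformation between normed vector spaces, that is, there exists a constant $K \ge 0$ such that, for all $x \in \mathcal {V}$,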

$$\begin{aligned} ||Tx||\le K||x||. \end{aligned}$$

(104)

The smallest value of K which satisfies this inequality is denoted by ||T|| and called the norm of T. It can be verified that this norm for operators satisfies the axioms for a norm function and that we may therefore talk of the vector space of bounded linear transformations $T: \mathcal {V} \rightarrow \mathcal {U}$. This normed vector space is denoted by $\mathcal {L}(\mathcal {V},\mathcal {U})$.

Definition 11

Consider an operator $T: \mathcal {V}\rightarrow \mathcal {U}$ where $\mathcal {V}$ is a vector space and $\mathcal {U}$ is a normed vector space. Let $x \in Dom(T)\subset \mathcal {V}$ and $s\in \mathcal {V}$. If the limit

$$\begin{aligned} dT(x;s) =\underset{\lambda \rightarrow 0}{\lim } \frac{T(x+\lambda s)-T(x)}{\lambda } \end{aligned}$$

(105)

exists, it is called the Gâteaux differential of T at x in the direction s. The limit is to be understood in the sense of convergence with respect to the norm in $\mathcal {U}$. The differential may exist for some s and fail to exist for others: if the differential exists at x for all s we say that T is Gâteaux differentiable at x.

The Gâteaux differential is homogeneous in s in the sense that

$$\begin{aligned} dT(x;\alpha s) =\alpha dT(x;s) \end{aligned}$$

(106)

but is in general neither linear nor continuous in s. Nor does the existence of the Gâteaux differential at x ensure continuity of T at x. For example,

$$\begin{aligned} f(\xi _1, \xi _2) = \left\{ \begin{array}{ll} \frac{\xi _{1}^3}{\xi _2} &{} (\xi _1,\xi _2 \ne 0)\\ 0 &{} (\xi _1 = \xi _2 =0) \end{array} \right. \ . \end{aligned}$$

(107)

At the point (0, 0), it is easy to show that the Gâteaux differential exists and is zero in every direction. The zero map is trivially a continuous linear operator in s. However, f is not continuous at (0, 0). Therefore, Gâteaux differentiability of T does not imply continuity of T.
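The behavior of the example in Eq. (107) can be checked numerically. The sketch below is only an illustration: the names `f` and `gateaux_quotient` are hypothetical, and it assumes that f is set to 0 wherever $\xi _2 = 0$ (Eq. (107) states the value explicitly only at the origin). It evaluates the difference quotient of Eq. (105) at the origin for a few directions, and then evaluates f along the curve $\xi _2 = \xi _1^3$.

```python
def f(xi1, xi2):
    """Example function of Eq. (107).

    Assumption: f is taken to be 0 whenever xi2 == 0, so that it is
    defined on the whole plane.
    """
    if xi2 == 0.0:
        return 0.0
    return xi1**3 / xi2


def gateaux_quotient(s1, s2, lam):
    """Difference quotient [f(0 + lam*s) - f(0)] / lam at the origin."""
    return (f(lam * s1, lam * s2) - f(0.0, 0.0)) / lam


# A few illustrative directions s = (s1, s2); the choice is arbitrary.
directions = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -3.0)]
step_sizes = (1e-1, 1e-3, 1e-5)

# The quotient shrinks linearly with lam in every direction tested.
quotients = {
    s: [gateaux_quotient(s[0], s[1], lam) for lam in step_sizes]
    for s in directions
}

# Along the curve xi2 = xi1**3 the function stays at 1 while the point
# approaches the origin, so f cannot be continuous there.
curve_values = [f(t, t**3) for t in (1e-1, 1e-2, 1e-3)]

for s, q in quotients.items():
    print("direction", s, "quotients", q)
print("values along xi2 = xi1^3:", curve_values)
```

The quotient tends to zero along every straight line through the origin, yet the curved approach keeps f away from f(0, 0) = 0, which is the point of the example.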

Let us go forward on the basis that $\mathcal {V}$ is also a normed vector space. Suppose dT(x; s) is linear and continuous in s for some $x\in \mathcal {V}$, then we may write

$$\begin{aligned} dT(x;s) = \underset{\lambda \rightarrow 0}{\lim } \frac{T(x+\lambda s)-T(x)}{\lambda } = T'_{G}(x)s \ . \end{aligned}$$

(108)

The operator $T'_{G}(x)$ is, by definition, a mapping $\mathcal {V}\rightarrow \mathcal {U}$ and is linear and continuous: we may conclude that

$$\begin{aligned} T'_{G}(x) \in \mathcal {L} (\mathcal {V},\mathcal {U}) \ . \end{aligned}$$

(109)

This operator is called the Gâteaux or weak derivative of T at x. It is very important to note that when speaking of the linearity and continuity of $T'_{G}(x)$, we mean those properties in the operator sense, that is, with respect to the argument s at a fixed x. $T'_{G}$ itself may be a function of x, but its continuity and linearity with respect to the variable x are completely different from the continuity and linearity discussed here.

When $T'_{G}(x)$ exists, it is certainly true that

$$\begin{aligned} T(x+\lambda s) - T(x) = T'_{G}(x)\lambda s + \epsilon (x,s,\lambda ) \ , \end{aligned}$$

(110)

where $\epsilon /\lambda \rightarrow 0$ as $\lambda \rightarrow 0$ with x and s fixed. However, the convergence may not be uniform with respect to s and in that case T cannot be approximated by a linear operator with uniform accuracy in the neighborhood of x. If we further demand uniform convergence then we arrive at the strong derivative.

Definition 12

Let $\mathcal {V}$ and $\mathcal {U}$ be normed vector spaces. An operator $T: \mathcal {V}\rightarrow \mathcal {U}$ is Fréchet differentiable at $x\in Dom(T)\subset \mathcal {V}$ if there exists a continuous linear operator $T'_F(x) \in \mathcal {L}(\mathcal {V},\mathcal {U})$ such that, for all $s\in \mathcal {V}$,

$$\begin{aligned} T(x+s) - T(x) = T'_F(x)s + \epsilon (x;s) \end{aligned}$$

(111)

with

$$\begin{aligned} \underset{||s||_{\mathcal {V}} \rightarrow 0}{\lim } \frac{||\epsilon (x;s)||_{\mathcal {U}}}{||s||_{\mathcal {V}}} =0 \ . \end{aligned}$$

(112)

The operator $T'_F(x)$ is called the Fréchet or strong derivative of T at x. The Fréchet derivative at x is unique. It can be shown that the existence of the Fréchet derivative of T at x implies continuity of T at x.
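As a worked illustration of Definition 12, consider again the function f of Eq. (107) at the origin. This is a sketch under two assumptions: the Euclidean norm is used on $\mathbb {R}^2$, and f is taken to be 0 wherever $\xi _2 = 0$. A Fréchet derivative, if it exists, must coincide with the Gâteaux derivative, so the only candidate is the zero operator and the remainder is $\epsilon (0;s) = f(s)$. Along the curved approach $s=(t,t^3)$ one finds

$$\begin{aligned} \frac{|\epsilon (0;s)|}{||s||} = \frac{|f(t,t^3)|}{\sqrt{t^2+t^6}} = \frac{1}{|t|\sqrt{1+t^4}} \rightarrow \infty \quad \text {as}\; t\rightarrow 0 \ , \end{aligned}$$

so the condition in Eq. (112) fails and f is not Fréchet differentiable at the origin, even though its Gâteaux differential exists there.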

Theorem 1

If the Gâteaux derivative $T'_{G}(x)$ exists in the neighborhood of x and is continuous with respect to the norm in $\mathcal {L}(\mathcal {V},\mathcal {U})$ at x, then the Fréchet derivative $T'_F(x)$ exists and is equal to $T'_{G}(x)$.

Appendix 2

In this appendix, we show that the Fréchet derivative does not exist in the normalized density domain, $\mathcal {J}_N$.

Define a normalized path wavefunction,

$$\begin{aligned} \Psi _{p}=\sqrt{1-\beta ^2}\Psi _0 + \beta \Psi _{\mathcal D} \ , \end{aligned}$$

(113)

where $\Psi _0$ is the GS wavefunction for an N-electron quantum system, $\Psi _{\mathcal D}$ is a linear combination of eigenfunctions in $\mathcal {D}$ of $\Psi _0$, and $0\le \beta \le 1$. Both $\Psi _0$ and $\Psi _{\mathcal D}$ are normalized to 1. The corresponding path density is

$$\begin{aligned} \rho _{p}(\mathbf {r})= & {} N\langle \Psi _{p}|\Psi _{p}\rangle _{N-1} \nonumber \\= & {} (1-\beta ^2)N\langle \Psi _{0}|\Psi _{0}\rangle _{N-1} + \beta ^{2}N\langle \Psi _{\mathcal D}|\Psi _{\mathcal D}\rangle _{N-1} \nonumber \\= & {} (1-\beta ^2)\rho _0(\mathbf {r}) + \beta ^2\rho _{\mathcal D}(\mathbf {r}) \ . \end{aligned}$$

(114)

When $\beta $ approaches 0, $\rho _{p}(\mathbf {r})$ approaches $\rho _0(\mathbf {r})$. Letting $\beta $ change continuously from 1 to 0, we obtain the desired density variational path. Equation (114) shows that the path density is automatically normalized to N; therefore, the density variation stays within the normalized space. Clearly, $\rho _{p}(\mathbf{r})$ lies in the neighborhood of $\rho _{0}(\mathbf{r})$ within $\mathcal {J}_N$. For convenience, we label $\mathcal {B}_N$ as the set of all legitimate N-representable $\rho _{p}(\mathbf{r})$ defined for a given $\Psi _{0}$ or $\rho _{0}(\mathbf{r})$ in Eq. (114).
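The intermediate step behind Eq. (114) and the normalization claim can be spelled out as follows. This is a sketch that assumes, as stated later in this appendix, that wavefunctions in $\mathcal {D}$ are order-1 strongly orthogonal to $\Psi _0$, so the cross term vanishes pointwise, and that $\langle \cdot \rangle $ denotes integration over $\mathbf {r}$ as in Eq. (118):

$$\begin{aligned} N\langle \Psi _{p}|\Psi _{p}\rangle _{N-1}&= (1-\beta ^2)\rho _0(\mathbf {r}) + \beta ^2\rho _{\mathcal D}(\mathbf {r}) + 2\beta \sqrt{1-\beta ^2}\, N Re\!\left( \langle \Psi _{0}|\Psi _{\mathcal D}\rangle _{N-1}\right) \nonumber \\&= (1-\beta ^2)\rho _0(\mathbf {r}) + \beta ^2\rho _{\mathcal D}(\mathbf {r}) \ , \nonumber \\ \left\langle \rho _{p}(\mathbf {r})\right\rangle&= (1-\beta ^2)N + \beta ^2 N = N \ . \end{aligned}$$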

A trial wavefunction is then assumed to yield the same path density:

$$\begin{aligned} \widetilde{\Psi } = \sqrt{1-\beta ^2}\Psi _0 + \lambda \Psi _t = \sqrt{1-\beta ^2}\Psi _0 + \lambda \sum _{i=0}^{\infty }c_i\Psi _i\longmapsto \rho _{p}(\mathbf {r}) \ , \end{aligned}$$

(115)

where $\Psi _i$ is the ith normalized eigenfunction of $\hat{H}$, $\langle \Psi _t|\Psi _t\rangle =1$, and the expansion coefficients $\{c_i\}$ are chosen to be real. The complete set of $\{\Psi _i\}$ can be divided into three parts: $\Psi _0$, $\mathcal S$, and $\mathcal D$. The electron density (the trial density) for $\widetilde{\Psi }$ takes the following form:

$$\begin{aligned} \widetilde{\rho }(\mathbf{r})= & {} N\langle \widetilde{\Psi }|\widetilde{\Psi }\rangle _{N-1}\nonumber \\= & {} (1-\beta ^2)N\langle \Psi _{0}|\Psi _{0}\rangle _{N-1}+\lambda ^{2}N\langle \Psi _{t}|\Psi _{t}\rangle _{N-1} + 2\lambda \sqrt{1-\beta ^2} N Re\!\left( \langle \Psi _{0}|\Psi _{t}\rangle _{N-1}\right) \nonumber \\= & {} (1-\beta ^2)\rho _{0}(\mathbf{r})+\lambda ^{2}\rho _{t}(\mathbf{r}) + 2\lambda \sqrt{1-\beta ^2} N Re\!\left( \langle \Psi _{0}|\Psi _{t}\rangle _{N-1}\right) \ . \end{aligned}$$

(116)

At any point, the trial density is identical to the path density to ensure that the density variation is actually along the path we designed:

$$\begin{aligned} \widetilde{\rho }(\mathbf{r})=\rho _{p}(\mathbf{r})\rightarrow \rho _{0}(\mathbf{r})\ . \end{aligned}$$

(117)

Therefore, we have

$$\begin{aligned} \left\langle \widetilde{\rho }(\mathbf{r})\right\rangle =\left\langle \rho _{p}(\mathbf{r})\right\rangle \ . \end{aligned}$$

(118)

Substituting Eqs. (114) and (116) into Eq. (118) and simplifying the result, one derives

$$\begin{aligned} \beta ^{2}=\lambda ^2 + 2\lambda c_0 \sqrt{1-\beta ^2} \ . \end{aligned}$$

(119)
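For reference, the simplification leading to Eq. (119) can be sketched as follows, assuming that integrating the partial overlap $\langle \Psi _{0}|\Psi _{t}\rangle _{N-1}$ over $\mathbf {r}$ gives the full overlap $\langle \Psi _{0}|\Psi _{t}\rangle = c_0$ (real by the choice of real coefficients):

$$\begin{aligned} \left\langle \widetilde{\rho }(\mathbf{r})\right\rangle&= (1-\beta ^2)N+\lambda ^{2}N + 2\lambda \sqrt{1-\beta ^2}\, N c_0 \ , \nonumber \\ \left\langle \rho _{p}(\mathbf{r})\right\rangle&= N \ , \nonumber \\ \Rightarrow \quad \beta ^{2}&= \lambda ^2 + 2\lambda c_0 \sqrt{1-\beta ^2} \ . \end{aligned}$$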

At one specific point on the variational path, the value of $\beta $ is fixed, and we can solve for $\lambda $ in terms of $\beta $ based on Eq. (119):

$$\begin{aligned} \lambda =-c_0 \sqrt{1-\beta ^2}\pm \sqrt{c_0^2 (1-\beta ^2) + \beta ^2} \ . \end{aligned}$$

(120)

Near the end of the variational path, when $\beta \rightarrow 0$ and $c_0 \ne 0$,

$$\begin{aligned} \lambda \rightarrow -c_0 \sqrt{1-\beta ^2} \pm \left[ c_0 \sqrt{1-\beta ^2} + \frac{1}{2 c_0}\beta ^2 + \cdots \right] \ . \end{aligned}$$

(121)

Again (see Appendix 3), the positive sign is chosen in Eq. (121), and we have

$$\begin{aligned} \lambda \rightarrow \frac{1}{2 c_0}\beta ^2 + \cdots ,\;\;\text {as}\;\beta \rightarrow 0 \ . \end{aligned}$$

(122)

Immediately, we can conclude that towards the end of the variational path, $\lambda $ is of the same order of magnitude as $\beta ^2/c_0$. In other words, $\lambda $ approaches zero at nearly the same rate as $\beta ^2/c_0$ does.
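The relations in Eqs. (119), (120), and (122) can be checked numerically. The following is a minimal sketch: the function names and the value $c_0 = 0.5$ are illustrative assumptions, not taken from the text. It verifies that both roots of Eq. (120) satisfy Eq. (119) and that the positive-sign root approaches $\beta ^2/(2c_0)$ as $\beta \rightarrow 0$.

```python
import math


def lambda_roots(beta, c0):
    """Both roots of Eq. (120) for given beta and c0."""
    a = math.sqrt(1.0 - beta**2)
    disc = math.sqrt(c0**2 * (1.0 - beta**2) + beta**2)
    return -c0 * a + disc, -c0 * a - disc


def constraint_residual(lam, beta, c0):
    """Left side minus right side of Eq. (119), rearranged to equal zero."""
    return lam**2 + 2.0 * lam * c0 * math.sqrt(1.0 - beta**2) - beta**2


C0 = 0.5  # illustrative nonzero value of c_0 (an assumption)

for beta in (1e-1, 1e-2, 1e-3):
    lam_plus, lam_minus = lambda_roots(beta, C0)
    # Ratio of the positive-sign root to the leading term of Eq. (122).
    ratio = lam_plus / (beta**2 / (2.0 * C0))
    print(
        f"beta={beta:g}  lambda_plus={lam_plus:.6e}  "
        f"ratio_to_leading_term={ratio:.6f}  "
        f"residuals=({constraint_residual(lam_plus, beta, C0):.1e}, "
        f"{constraint_residual(lam_minus, beta, C0):.1e})"
    )
```

The ratio tends to 1 as $\beta $ decreases, consistent with $\lambda $ vanishing quadratically in $\beta $ for the positive-sign root.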

Because of Eqs. (114), (116), and (117), we obtain

$$\begin{aligned} \lambda ^{2}\rho _{t}+ 2N\lambda \sqrt{1-\beta ^2} Re\!\left( \left\langle \left. \Psi _{0}\right| \!\Psi _{t}\right\rangle _{N-1}\right) =\beta ^{2}\rho _{\mathcal {D}}\ . \end{aligned}$$

(123)

Substituting Eq. (119) into Eq. (123) yields

$$\begin{aligned} 2\sqrt{1-\beta ^2}\left[ c_{0} \rho _{\mathcal {D}}-N\sum _{i}^{\mathcal {S}_{0}}c_{i} Re\!\left( \!\left\langle \left. \Psi _{0}\right| \!\Psi _{i}\right\rangle _{N-1}\right) \right] = \lambda (\rho _{t}-\rho _{\mathcal {D}})\;, \end{aligned}$$

(124)

where the summation on the LHS is only within $\mathcal {S}_0$. At $\beta \rightarrow 0$, we find that the coefficients $\{c_i\}$ for $\Psi _i \in {\mathcal S}_0$ are linear in $\lambda $.
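The substitution that produces Eq. (124) can be sketched step by step. The sketch assumes that $\mathcal {S}_0$ denotes $\Psi _0$ together with $\mathcal {S}$, and that terms with $\Psi _i \in \mathcal {D}$ drop out of $\langle \Psi _{0}|\Psi _{t}\rangle _{N-1}$ by order-1 strong orthogonality:

$$\begin{aligned} \lambda ^{2}\rho _{t}+ 2N\lambda \sqrt{1-\beta ^2}\sum _{i}^{\mathcal {S}_{0}}c_{i} Re\!\left( \left\langle \left. \Psi _{0}\right| \!\Psi _{i}\right\rangle _{N-1}\right)&= \left( \lambda ^2 + 2\lambda c_0 \sqrt{1-\beta ^2}\right) \rho _{\mathcal {D}} \ , \nonumber \\ 2\lambda \sqrt{1-\beta ^2}\left[ c_{0} \rho _{\mathcal {D}}-N\sum _{i}^{\mathcal {S}_{0}}c_{i} Re\!\left( \left\langle \left. \Psi _{0}\right| \!\Psi _{i}\right\rangle _{N-1}\right) \right]&= \lambda ^{2}(\rho _{t}-\rho _{\mathcal {D}}) \ . \end{aligned}$$

Dividing both sides by $\lambda \ne 0$ gives Eq. (124).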

Having established the behavior of $\{c_i\}$ for wavefunctions in $\mathcal {S}_0$, we next investigate the remaining $\{c_i\}$ for wavefunctions in $\mathcal {D}$. At one particular point on the variational path ($\beta $ fixed), we optimize the trial wavefunction to find the set of coefficients $\{c_i\}$ that yields the lowest energy for

$$\begin{aligned} \langle \widetilde{\Psi }|\hat{H}|\widetilde{\Psi }\rangle= & {} \left\langle \! \sqrt{1-\beta ^2}\Psi _0+\lambda \Psi _{t}\left| \hat{H}\right| \sqrt{1-\beta ^2}\Psi _0+\lambda \Psi _{t}\!\right\rangle \nonumber \\= & {} E_0 -\left[ \beta ^2 - 2\lambda c_0 \sqrt{1-\beta ^2}\right] E_0 + \lambda ^2\langle \Psi _t|\hat{H}|\Psi _t\rangle \nonumber \\= & {} E_0 - \lambda ^2 E_0 + \lambda ^2\langle \Psi _t|\hat{H}|\Psi _t\rangle = E_0 + \lambda ^2\left( \langle \Psi _t|\hat{H}|\Psi _t\rangle - E_0 \right) \ , \end{aligned}$$

(125)

where Eq. (119) has been used to simplify the expression after the second equal sign. Obviously, we only need to minimize the last term in Eq. (125) under the following two constraints:

$$\begin{aligned} \sum _{i=0}^{\infty } c_{i}^{2}=1\;, \end{aligned}$$

(126)

and

$$\begin{aligned} \widetilde{\rho }(\mathbf{r})=\rho _{p}(\mathbf r)\ . \end{aligned}$$

(127)

The density constraint, Eq. (127), is equivalent to the following identity based on our previous analysis:

$$\begin{aligned}&2\lambda \sqrt{1-\beta ^2}\left[ \frac{c_{0}(\rho _{0}-\rho _{\mathcal {D}})}{N}+\sum _{i}^{\mathcal {S}}c_{i} Re\!\left( \!\left\langle \left. \Psi _{0}\right| \!\Psi _{i}\right\rangle _{N-1}\right) \right] \nonumber \\&\ \ \ = \lambda ^{2}\left( \frac{\rho _{\mathcal {D}}}{N}-\sum _{i,j}^{\infty }c_{j} c_{i}\left\langle \left. \Psi _{j}\right| \!\Psi _{i}\right\rangle _{N-1}\!\right) \ . \end{aligned}$$

(128)

We will use the Euler-Lagrange multiplier method to find the set of coefficients $\{c_{i}\}$ that minimizes the value of $\lambda ^{2}\left( \langle \Psi _{t}|\hat{H}|\Psi _{t}\rangle -E_{0}\right) $. Let

$$\begin{aligned} \mathbf{A}=\sum _{i=0}^{\infty } c_{i}^{2}\ -1\;, \end{aligned}$$

(129)

$$\begin{aligned} \mathbf{B}= & {} 2\lambda \sqrt{1-\beta ^2}\left[ \frac{c_{0}(\rho _{0}-\rho _{\mathcal {D}})}{N}+\sum _{i}^{\mathcal {S}} c_{i} Re\!\left( \!\left\langle \left. \Psi _{0}\right| \!\Psi _{i}\right\rangle _{N-1}\right) \!\right] \nonumber \\&-\lambda ^{2}\left( \frac{\rho _{\mathcal {D}}}{N}-\sum _{i,j}^\infty c_{i} c_{j}\left\langle \left. \Psi _{i}\right| \!\Psi _{j}\right\rangle _{N-1}\!\right) \;,\end{aligned}$$

(130)

and

$$\begin{aligned} \mathbf{\mathbf{\Omega }}= & {} \lambda ^{2}\left( \langle \Psi _{t}|\hat{H}|\Psi _{t}\rangle -E_{0}\right) -h\mathbf{A}-\langle g(\mathbf{r})\,\mathbf{B}\rangle \nonumber \\= & {} \lambda ^{2}\left( \sum _{i,j}^{\infty }c_{i} c_{j}\langle \Psi _{i}|\hat{H}|\Psi _{j}\rangle -E_{0}\right) -h\mathbf{A}-\langle g(\mathbf{r})\,\mathbf{B}\rangle \nonumber \\= & {} \lambda ^{2}\left( \sum _{i,j}^{\infty }c_{i} c_{j}E_{j}\delta _{ij}-E_{0}\right) -h\mathbf{A}-\langle g(\mathbf{r})\,\mathbf{B}\rangle \nonumber \\= & {} \lambda ^{2}\left[ \sum _{i=1}^{\infty } c_{i}^{2}(E_{i}-E_{0})\right] -h\mathbf{A}-\langle g(\mathbf{r})\,\mathbf{B}\rangle \ , \end{aligned}$$

(131)

where h and $g(\mathbf{r})$ are the Lagrange multipliers corresponding to the two constraints in Eqs. (129) and (130). Minimizing Eq. (131) with respect to $\{c_{i}\}$, one obtains

$$\begin{aligned} \lambda \left\langle \!\left[ \frac{\sqrt{1-\beta ^2}(\rho _{0}-\rho _{\mathcal {D}})}{N}+\lambda \sum _{j=0}^{\infty }c_{j}Re\! \left( \!\left\langle \left. \Psi _{0}\right| \!\Psi _{j}\right\rangle _{N-1}\right) \!\right] g(\mathbf{r})\!\!\right\rangle =-h c_{0}\ , \end{aligned}$$

(132)

and

$$\begin{aligned}&\lambda \left\langle \! \left[ \sqrt{1-\beta ^2}Re\! \left( \!\left\langle \left. \Psi _{i}\right| \Psi _{0}\right\rangle _{N-1}\right) +\lambda \sum _{j=0}^{\infty }c_{j}Re\! \left( \!\left\langle \left. \Psi _{i}\right| \!\Psi _{j}\right\rangle _{N-1}\right) \!\right] g(\mathbf{r})\!\!\right\rangle \nonumber \\&\;\;\;\;=\left[ \lambda ^2(E_i - E_0) - h \right] c_{i}\;\;\;\;\;\;\;\;\; (\text {for}\;i\ne 0)\ . \end{aligned}$$

(133)

Because $c_0$ is linear in $\lambda $ as we previously showed, we can readily infer from Eq. (132) that $g(\mathbf{r})$ must take the following form:

$$\begin{aligned} g_{\scriptstyle {\lambda }}(\mathbf{r})=g^{(0)}(\mathbf{r})+\sum _{k=1}^{\infty }\frac{g^{(k)}(\mathbf{r})}{k!}\lambda ^{k}\ . \end{aligned}$$

(134)

Substituting Eq. (134) into Eq. (133) and ignoring the higher-order terms as $\lambda \rightarrow 0$, we obtain an equation for $\Psi _i\in {\mathcal S}$,

$$\begin{aligned} - hc_{i}=\lambda \sqrt{1-\beta ^2}\left\langle Re\! \left( \!\left\langle \left. \Psi _{i}\right| \Psi _{0}\right\rangle _{N-1}\right) g^{(0)}(\mathbf{r})\!\right\rangle + h.o. \ , \end{aligned}$$

(135)

where “h.o.” denotes higher-order terms in $\lambda $. Therefore, we reach the same conclusion as before: the $\{c_i\}$ for $\Psi _i \in \mathcal S$ are linear in $\lambda $ towards the end of the variational path. For the $\{c_i\}$ with $\Psi _i \in \mathcal D$, utilizing the additional fact that $\Psi _i$ is order-1 strongly orthogonal to $\Psi _0$, we can further simplify Eq. (133) to

$$\begin{aligned}&\lambda ^2\left\langle \! \sum _{j}^{\mathcal {S}_{0}}c_{j}Re\! \left( \!\left\langle \left. \Psi _{i}\right| \!\Psi _{j}\right\rangle _{N-1}\right) \!g(\mathbf{r})\!\!\right\rangle + \lambda ^2\left\langle \! \sum _{j}^{\mathcal D}c_{j}Re\! \left( \!\left\langle \left. \Psi _{i}\right| \!\Psi _{j}\right\rangle _{N-1}\right) \!g(\mathbf{r})\!\!\right\rangle \nonumber \\&\;\;\;\;=\lambda ^2(E_i - E_0) c_i - h c_{i} \ . \end{aligned}$$

(136)

For this equation to be valid as $\lambda \rightarrow 0$, the LHS and the RHS must have the same dependence on $\lambda $. On the RHS, the first term decays faster than the second term, so the second term dominates when $\lambda $ approaches 0. Therefore, we must match the magnitude of the second term on the RHS to the LHS. Of course, we cannot match it with the second term on the LHS because doing so leads to a self-inconsistency. Then, the second term on the RHS must decay in the same way as the first term on the LHS. Thus, the $\{c_i\}$ for $\Psi _i \in \mathcal D$ are proportional to $\lambda ^3$. Unfortunately, such a $\lambda ^{3}$-behavior contradicts the normalization constraint in Eq. (126), because $\sum _i c_{i}^{2}$ will become 0 as $\lambda \rightarrow 0$. Hence, we conclude that this contradiction must come from the assumption ${\Psi _{t}=\sum _i c_{i}\Psi _{i}}$, where the expansion is over the complete set of eigenfunctions of $\hat{H}$.
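The contradiction can be made explicit with a short estimate. As a sketch, write $c_i = a_i\lambda $ for $\Psi _i \in \mathcal {S}_0$ and $c_i = b_i\lambda ^3$ for $\Psi _i \in \mathcal {D}$, where $a_i$ and $b_i$ are hypothetical $\lambda $-independent constants introduced here only for illustration, and assume that the sums below are finite:

$$\begin{aligned} \sum _{i=0}^{\infty } c_{i}^{2} = \lambda ^{2}\sum _{i}^{\mathcal {S}_{0}} a_{i}^{2} + \lambda ^{6}\sum _{i}^{\mathcal {D}} b_{i}^{2} \rightarrow 0 \quad \text {as}\; \lambda \rightarrow 0 \ , \end{aligned}$$

which cannot equal 1 as Eq. (126) requires.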

To resolve the contradiction, we have to modify our assumption about the expansion of $\Psi _{t}$. We notice that if the summation $\sum _{i}c_{i}\Psi _{i}$ includes any wavefunction from $\mathcal {S}_{0}$, the same problem will persist. Therefore, $\Psi _{t}$ can only be expanded in $\mathcal {D}$,

$$\begin{aligned} \Psi _{t}=\sum _{i}^{\mathcal {D}}c_{i}\Psi _{i}\ . \end{aligned}$$

(137)

In this case, Eq. (127) is equivalent to

$$\begin{aligned} \lambda ^{2}\rho _{t}(\mathbf{r})=\beta ^{2}\rho _{\mathcal {D}}(\mathbf{r})\;. \end{aligned}$$

(138)

Integrating both sides of Eq. (138) over the entire space, one obtains

$$\begin{aligned} \lambda ^{2}=\beta ^{2}\;, \end{aligned}$$

(139)

which further ensures that

$$\begin{aligned} \rho _{t}(\mathbf{r})=\rho _{\mathcal {D}}(\mathbf{r})\;. \end{aligned}$$

(140)

Now, the original minimization process is reduced to minimizing the following term,

$$\begin{aligned} {\varvec{\Xi }} = \left( \sum _{i}^{\mathcal {D}}|c_{i}|^{2}E_{i}\right) -h\left( \sum _{i}^{\mathcal {D}}|c_{i}|^{2}-1\right) -\left\langle \!\!\left( \frac{\rho _{\mathcal {D}}}{N}-\sum _{i,j}^{\mathcal {D}}c_i^* c_j\langle \Psi _i|\Psi _j\rangle _{N-1}\right) g(\mathbf{r})\!\!\right\rangle .\nonumber \\ \end{aligned}$$

(141)

Suppose this minimization yields the optimal set of expansion coefficients, $\{\bar{c}_i\}$, which, judging from the form of Eq. (141), have no dependence on $\lambda $ and $\beta $. Then, we have

$$\begin{aligned}&\underset{\scriptstyle {\Psi _{0}+\delta \Psi _{0}\rightarrow \rho _{0}+\delta \rho }}{\inf }\frac{\langle \delta \Psi |\hat{H}-E_{0}|\delta \Psi \rangle }{||\delta \rho ||} = \underset{\scriptstyle {\Psi _{0}+\delta \Psi \rightarrow \rho _{0}+\delta \rho }}{\inf }\frac{\langle \widetilde{\Psi }-\Psi _0|\hat{H}-E_{0}|\widetilde{\Psi }- \Psi _0\rangle }{||\rho _{p}-\rho _0||}\nonumber \\&\;\;\;\;\;\;\;= \underset{\scriptstyle {\Psi _{t}\rightarrow \rho _{\mathcal {D}}}}{\inf }\frac{\left\langle (\sqrt{1-\beta ^2}-1)\Psi _0 + \lambda \Psi _{t}\left| \hat{H}-E_{0}\right| (\sqrt{1-\beta ^2}-1)\Psi _0 + \lambda \Psi _{t}\right\rangle }{||\beta ^{2}(\rho _{\mathcal {D}}-\rho _0)||}\nonumber \\&\;\;\;\;\;\;\;= \underset{\scriptstyle {\Psi _{t}\rightarrow \rho _{\mathcal {D}}}}{\inf }\frac{\lambda ^{2}\langle \Psi _{t}|\hat{H}-E_{0}|\Psi _{t}\rangle }{\beta ^{2}||\rho _{\mathcal {D}}-\rho _0||} = \frac{\underset{\scriptstyle {\Psi _{t}\rightarrow \rho _{\mathcal {D}}}}{\inf }\langle \Psi _{t}|\hat{H}-E_{0}|\Psi _{t}\rangle }{||\rho _{\mathcal {D}}-\rho _0||}\nonumber \\&\;\;\;\;\;\;\; =\frac{1}{||\rho _{\mathcal {D}}-\rho _0||}\displaystyle \left\langle \!\!\left. \sum _{i}^{\mathcal {D}}\bar{c}_i\Psi _{i}\right| \hat{H}-E_{0}\left| \sum _{j}^{\mathcal {D}}\bar{c}_j\Psi _{j}\right. \!\!\right\rangle = \frac{1}{||\rho _{\mathcal {D}}-\rho _0||} \left[ \sum _{i}^{\mathcal {D}}|\bar{c}_i|^2 E_i - E_0\right] \nonumber \\&\;\;\;\;\;\;\;> \frac{1}{||\rho _{\mathcal {D}}-\rho _0||} \left[ \sum _{i}^{\mathcal {D}}|\bar{c}_i|^2 E_0 - E_0\right] = 0 \ , \end{aligned}$$

(142)

where Eq. (139) is used to simplify the expression after the third equal sign. Evidently, Eq. (142) suggests that the condition for Fréchet differentiability proposed by Lindgren and Salomonson [8,9,10] is not fulfilled. In other words, the Fréchet derivative does not exist in the normalized density domain $\mathcal {J}_N$ either.

Appendix 3

In this appendix, we analyze the consequence of choosing the negative sign in Eqs. (41) and (121). In the end, we will conclude that this particular choice is fully equivalent to the more natural decision made in the main text and Appendix 2.

Let us start from a unified version of Eqs. (40) and (120):

$$\begin{aligned} \lambda =-a c_0\pm \sqrt{a^2 c_0^2 + \beta ^2} \ , \end{aligned}$$

(143)

where the constant a equals 1 in the main text and $\sqrt{1-\beta ^2}$ in Appendix 2. Obviously, if $c_0 = 0$ or $c_0 \rightarrow 0$ as $\beta \rightarrow 0$, both $\lambda $ and $\beta $ approach 0 concurrently near the end of the variational path.

We only need to further examine the situation when $c_0 \ne 0$ as $\beta \rightarrow 0$ with the choice of the negative sign in Eq. (143):

$$\begin{aligned} \lambda = -2a c_0 - \lambda ' \ , \end{aligned}$$

(144)

where the residual term $\lambda '$ approaches 0 as $\beta \rightarrow 0$:

$$\begin{aligned} \lambda ' = \left[ \frac{1}{2ac_0}\beta ^2 - \frac{1}{8 a^3 c_0^3}\beta ^4 + \cdots \right] \rightarrow 0 \ . \end{aligned}$$

(145)
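The expansion in Eqs. (144) and (145) can be checked numerically. The sketch below uses hypothetical function names and illustrative values $a = 1$ and $c_0 = 0.5$, and it assumes $a c_0 > 0$ so that $\sqrt{a^2 c_0^2} = a c_0$. It compares the exact residual $\lambda ' = -\lambda - 2ac_0$ for the negative-sign root of Eq. (143) with the two-term series of Eq. (145).

```python
import math


def lam_negative_root(a, c0, beta):
    """Negative-sign root of Eq. (143)."""
    return -a * c0 - math.sqrt(a**2 * c0**2 + beta**2)


def lam_prime_exact(a, c0, beta):
    """Residual defined through Eq. (144): lambda' = -lambda - 2*a*c0."""
    return -lam_negative_root(a, c0, beta) - 2.0 * a * c0


def lam_prime_series(a, c0, beta):
    """First two terms of the series in Eq. (145)."""
    return beta**2 / (2.0 * a * c0) - beta**4 / (8.0 * a**3 * c0**3)


A, C0 = 1.0, 0.5  # illustrative values (assumptions), with A * C0 > 0

for beta in (1e-1, 5e-2, 1e-2):
    exact = lam_prime_exact(A, C0, beta)
    series = lam_prime_series(A, C0, beta)
    print(
        f"beta={beta:g}  exact={exact:.10e}  series={series:.10e}  "
        f"difference={exact - series:.2e}"
    )
```

The difference between the exact residual and the truncated series shrinks rapidly with $\beta $, and both tend to zero, as stated below Eq. (144).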

Consequently, Eqs. (46) and (124) can be rewritten as

$$\begin{aligned} 2a\!\left[ c_{0}(\rho _{0}-\rho _{t})+N\sum _{i}^{\mathcal {S}} c_{i} Re\!\left( \!\left\langle \left. \Psi _{0}\right| \!\Psi _{i}\right\rangle _{N-1}\right) \right] =\lambda '\left( \rho _{t}-\rho _{\mathcal {D}}\right) \ , \end{aligned}$$

(146)

which immediately suggests that as $\beta \rightarrow 0$, $(\rho _{0}-\rho _{t})$ and the coefficients $\{c_i\}$ for $\Psi _i \in {\mathcal S}$ are linear in $\lambda '$. Because $\rho _{t} \rightarrow \rho _{0}$, $\Psi _t \rightarrow c_0 \Psi _0$ with $|c_0| \rightarrow 1$, as $\beta \rightarrow 0$. Therefore, at the end of the variational path ($\beta =0$ and $\lambda ' = 0$), $\lambda = -2ac_0$, $\Psi _t = c_0 \Psi _0$, $|c_0| = 1$, and $\widetilde{\Psi } = -a \Psi _0$.

Evidently, the choice of the negative sign in Eqs. (41) and (121) yields a fully equivalent, alternative trial wavefunction,

$$\begin{aligned} \widetilde{\Psi }' = -a \Psi _0 - \lambda '\, \Psi _t \ , \end{aligned}$$

(147)

where $\lambda ' \rightarrow 0$ as $\beta \rightarrow 0$. Then, we can carry out the discussion on the basis of $\lambda ' \rightarrow 0$ instead.

Xiang, P., Wang, Y.A. (2018). Functional Derivatives and Differentiability in Density-Functional Theory. In: Wang, Y., Thachuk, M., Krems, R., Maruani, J. (eds) Concepts, Methods and Applications of Quantum Systems in Chemistry and Physics. Progress in Theoretical Chemistry and Physics, vol 31. Springer, Cham. https://doi.org/10.1007/978-3-319-74582-4_18

DOI: https://doi.org/10.1007/978-3-319-74582-4_18
Published: 18 May 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-74581-7
Online ISBN: 978-3-319-74582-4
eBook Packages: Chemistry and Materials Science, Chemistry and Material Science (R0)
