1 Introduction

Password Authenticated Key Exchange (PAKE) is a primitive that allows two or more users who start only from a low-entropy shared secret (a typical user authentication setting today) to agree on a cryptographically strong session key. Since the introduction of PAKE in 1992, a plethora of protocols trying to achieve secure PAKE have been proposed. However, due to patent issues, only recently have PAKEs begun to be considered for wide-scale use: SRP [28] has been used in the password manager 1Password [2], J-PAKE of Hao and Ryan [14] was used in Firefox Sync [12], while the Elliptic Curve (EC) version of the same protocol (EC-J-PAKE [11]) has been used to enable authentication and authorization for network access for Internet-of-Things (IoT) devices under the Thread network protocol [13].

From a deployment perspective, the most significant advantage of using PAKE compared to a typical key exchange protocol is that it avoids dependence on a functional Public Key Infrastructure (PKI). On the downside, the use of a low-entropy secret as the primary means of authentication comes at a price: PAKEs are inherently vulnerable to online dictionary attacks. To mount this attack, all an adversary needs to do is repeatedly send candidate passwords to the verifying server to test their validity. In practice, this type of attack can be avoided relatively easily in a two-party setting by limiting the number of guesses (i.e., wrong login attempts) that can be made in a given time frame.
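As a rough illustration of limiting guesses, the following sketch keeps a sliding window of failed attempts per account. The class name `GuessLimiter`, its default thresholds, and the in-memory storage are assumptions made for this example; they are not taken from any PAKE specification.

```python
import time
from collections import defaultdict, deque


class GuessLimiter:
    """Sliding-window limit on failed login attempts per account (hypothetical helper)."""

    def __init__(self, max_failures=5, window_seconds=300.0):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # account -> timestamps of recent failures

    def _prune(self, account, now):
        # Forget failures that have fallen out of the time window.
        recent = self._failures[account]
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        return recent

    def allowed(self, account, now=None):
        """Return True if the account may start another protocol run."""
        now = time.monotonic() if now is None else now
        return len(self._prune(account, now)) < self.max_failures

    def record_failure(self, account, now=None):
        """Call when a protocol run ends with a failed password check."""
        now = time.monotonic() if now is None else now
        self._prune(account, now).append(now)
```

A server would call `allowed` before answering a login attempt and `record_failure` whenever the password check fails. Such a limit only addresses online guessing; it does nothing against the offline attacks discussed next.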

At the same time, a well-designed PAKE must be resistant against offline dictionary attacks. In such an attack scenario, the adversary typically operates in two phases: in the first (usually online) phase, the adversary, either by eavesdropping or by impersonating a user, tries to collect a function of the targeted password that can serve as a password verifier. Later, in the second (offline) phase, the adversary correlates the verifier collected in the first phase with offline password guesses to determine the correct password.

In terms of design, PAKEs can follow a symmetric or an asymmetric approach with respect to the value that is used as an authenticator. For instance, the first PAKE to be proposed, EKE [6], follows a symmetric design strategy: both client and server are required to know their joint password in the clear to successfully run the EKE protocol. Such protocols are usually called balanced PAKEs. Over time it has been realized that the risk of losing a large number of passwords in case of a server compromise increases if passwords are kept in the clear. The damage inflicted by such a loss could be very high, especially today, when most people use many online services while authenticating with only a few related passwords.

One way to mitigate such a threat is to use an asymmetrically designed PAKE, also known as an augmented PAKE. This type of PAKE guarantees that the password is not stored on the server side as plaintext but as an image of the password. Nevertheless, it has long been argued, from a theoretical perspective, that augmented PAKEs do not add much benefit over balanced PAKEs, since a brute-force attack on a stolen password file (a list containing password hashes) would quickly yield a number of underlying passwords. With the introduction of sequential memory-hard hash functions such as Scrypt [25] and Argon2 [8], combined with the use of salt, password cracking can be slowed down significantly, so this may no longer be the case.
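The effect of a salted, memory-hard hash on a stolen password file can be sketched with `hashlib.scrypt` from the Python standard library. The function names and the cost parameters (`n`, `r`, `p`) below are illustrative assumptions, not recommendations from the paper; this is a generic password-file example and not part of zkPAKE.

```python
import hashlib
import hmac
import os


def make_password_record(password, *, n=2**14, r=8, p=1):
    """Return (salt, digest) to store instead of the plaintext password."""
    salt = os.urandom(16)  # fresh random salt per user
    digest = hashlib.scrypt(password.encode(), salt=salt, n=n, r=r, p=p, dklen=32)
    return salt, digest


def verify_password(password, salt, digest, *, n=2**14, r=8, p=1):
    """Recompute the scrypt digest and compare in constant time."""
    candidate = hashlib.scrypt(password.encode(), salt=salt, n=n, r=r, p=p, dklen=32)
    return hmac.compare_digest(candidate, digest)
```

Each guess against a leaked record costs the attacker one full scrypt evaluation, and the per-user salt prevents reusing work across records.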

1.1 Our Contribution

Recently, Mochetti, Resende and Aranha [23] proposed (without exhibiting a security proof) a simple augmented PAKE called zkPAKE, which they claim is suitable for banking applications, requiring the server to store only the image of a password under a one-way function. Their main idea was to use a zero-knowledge proof of knowledge of the password to design an efficient PAKE. Here, however, we present an offline dictionary attack against the zkPAKE protocol. In addition, we show that the same attack works on a slight variant of zkPAKE that was proposed later in [24]. We also provide a prototype and share benchmarks of the attack to demonstrate its feasibility. Our dictionary attack can be carried out in two ways: passively, by eavesdropping on a zkPAKE protocol execution, or actively, by impersonating the server and having the client attempt to log in.

1.2 Previous Works

Password Authenticated Key Exchange was introduced by Bellovin and Merritt [6] in 1992. Their EKE protocol was the first to show that it is possible to bootstrap a low-entropy string into a strong cryptographic key. A few years later, Jablon proposed an alternative, the SPEKE protocol [18]. Over the next 25 years, many more PAKE proposals have surfaced [4, 14, 21, 22]. In parallel, augmented versions of different PAKEs were introduced (e.g. A-EKE [7], B-SPEKE [19]). As explained above, augmented PAKEs have an additional security property compared to balanced PAKEs: if implemented well, they are considered to be more resistant to server compromise in the sense that clients' passwords are not immediately revealed once the password file is leaked, since the attacker still has to perform password cracking. Finally, a number of PAKE protocols have been standardized in IEEE [16], IETF [15] and ISO [17].

The security of early PAKE proposals was argued only informally, by showing that a protocol can withstand all known attacks. Starting from 2000, two formal models of security for PAKE appeared in [5] and [9]. More specifically, following a game-based approach, Bellare, Pointcheval and Rogaway argued in [5] that a provably secure PAKE protocol must provide indistinguishability of the session key and satisfy the authentication property. The Real-or-Random (RoR) variant of their model from [3], along with the Universally Composable PAKE model from [10], are considered to be state-of-the-art models that rigorously capture PAKE security requirements.

Since we deal exclusively with an offline dictionary attack on zkPAKE in this paper, we keep the discussion here short and refer readers to Pointcheval's survey [26] for a more detailed overview of the PAKE research field.

1.3 Organization

The rest of the paper is organized as follows. Section 2 describes the zkPAKE protocol and its variant. In Sect. 3, we present an offline dictionary attack against both variants of the zkPAKE protocol. Finally, we conclude the paper in Sect. 4.

2 The zkPAKE Protocol

In this section, we review the zkPAKE protocol. We start with the variant of zkPAKE from [24], whose description is presented in Fig. 1, and then point out the differences with the original design from [23]. We choose this order of presentation because the later variant is slightly more elaborate than the original zkPAKE, and we want to show that zkPAKE does not withstand our attack even with the proposed modifications.

2.1 Protocol Description

zkPAKE, as described in [24], is a two-party augmented PAKE protocol meant to provide authenticated key exchange between a server S and a client C.

Initialization Phase. The protocol starts with an enrollment phase, which is executed only once for every client. In this phase, a client and a server (e.g., a bank) share a secret value of low entropy that can be remembered by the client. More specifically, in the case of zkPAKE, the client must remember the password \(\pi \), while the server only stores an image R of the password. Before the server computes the corresponding image R, public parameters must be chosen and agreed on: (1) a finite cyclic group \(\mathbb {G}\) of prime order q and a random generator g of the group \(\mathbb {G}\); (2) hash functions \(H_1\) and \(H_2\) whose outputs are k-bit strings, where k is the security parameter representing the length of session keys.
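The enrollment step can be sketched as follows. The sketch assumes that the stored image is \(R = g^r\), where r is the hash of the password reduced modulo q (this matches the server-side check in the protocol, which uses R where the client uses \(g^r\)). The tiny group (p = 2039, q = 1019, g = 4), the use of SHA-256, and the helper names are assumptions chosen for readability; they are far too small for real use.

```python
import hashlib

# Toy public parameters (assumption, insecure): g generates the subgroup
# of prime order Q in the multiplicative group modulo P = 2*Q + 1.
P, Q, G = 2039, 1019, 4


def hash_pw(password):
    """Hash the password into Z_q (hypothetical instantiation of r)."""
    digest = hashlib.sha256(b"pw|" + password.encode()).digest()
    return int.from_bytes(digest, "big") % Q


def enroll(password):
    """Return the image R = g^r that the server stores for this client."""
    r = hash_pw(password)
    return pow(G, r, P)


R = enroll("correct horse")
```

The server keeps only `R`; the client keeps only the password and recomputes r on every login.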

Protocol Execution. Once the enrollment phase is executed and the public parameters are established, the zkPAKE protocol (see Fig. 1) runs in three communication rounds as follows:

  1. First, the server S chooses a random value n from \(\mathbb {Z}_q\), computes \(N := g^n\), which is supposed to act both as a nonce and as a Diffie-Hellman value, and sends it to the client C.

  2. Upon receiving the nonce N, the client C inputs his password, computes the hash r of the password, chooses a random element v from \(\mathbb {Z}_q\), and computes \(t:=N^v\). Then, C computes \(c := H_1(g, g^r, t, N)\) and obtains \(u:= v - H_1(c)r\), which should lie in \(\mathbb {Z}_q\). Next, C computes the session key \(sk_c := H_2(c)\) and sends u and \(H_1(c)\) to S.

  3. Upon receiving \(H_1(c)\) and u, S recovers \(t'\) by computing \(g^{un}R^{nH_1(c)}\). Then, S calculates \(c' := H_1(g, R, t', N)\). Next, S checks if \(H_1(c')\) matches \(H_1(c)\). If it does, S computes the session key \(sk_s := H_2(c')\) and sends \(H_1(sk_s)\) to C. Otherwise, it aborts the protocol.

  4. Finally, upon receiving \(H_1(sk_s)\), C checks if \(H_1(sk_s)\) and \(H_1(sk_c)\) match. If the values are equal, C saves the computed session key \(sk_c\) and terminates.

Fig. 1. The zkPAKE protocol.
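The message flow of Fig. 1 can be simulated in a few lines. This is a minimal sketch under stated assumptions: \(N = g^n\) and \(R = g^r\), SHA-256 with a domain-separation tag standing in for \(H_1\), \(H_2\) and the password hash, and a toy group (p = 2039, q = 1019, g = 4) that is far too small for real use. All function names are hypothetical.

```python
import hashlib
import secrets

P, Q, G = 2039, 1019, 4  # toy group parameters (assumption, insecure)


def _h(tag, *parts):
    # Hash a tuple of values to an integer; the tag separates H1, H2 and the password hash.
    data = tag + b"|" + b"|".join(str(x).encode() for x in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def H1(*parts):
    return _h(b"H1", *parts)


def H2(*parts):
    return _h(b"H2", *parts)


def hash_pw(password):
    return _h(b"pw", password) % Q


def enroll(password):
    return pow(G, hash_pw(password), P)  # R = g^r (assumption)


def run_zkpake(client_password, R):
    """Simulate one run of the variant in Fig. 1; return (sk_c, sk_s) or None on abort."""
    # Round 1: server picks n and sends N = g^n.
    n = secrets.randbelow(Q - 1) + 1
    N = pow(G, n, P)

    # Round 2: client computes t = N^v, c, u and sends (u, H1(c)).
    r = hash_pw(client_password)
    v = secrets.randbelow(Q)
    t = pow(N, v, P)
    c = H1(G, pow(G, r, P), t, N)
    h = H1(c)
    u = (v - h * r) % Q
    sk_c = H2(c)

    # Round 3: server recovers t' = g^{un} R^{n H1(c)} and checks H1(c').
    t_prime = pow(G, u * n, P) * pow(R, n * h, P) % P
    c_prime = H1(G, R, t_prime, N)
    if H1(c_prime) != h:
        return None  # server aborts
    sk_s = H2(c_prime)

    # Client-side check of the confirmation value H1(sk_s).
    if H1(sk_s) != H1(sk_c):
        return None
    return sk_c, sk_s
```

For example, `run_zkpake("correct horse", enroll("correct horse"))` returns two equal session keys, while a run with a different password aborts unless the two passwords happen to hash to the same r in the toy group.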

As mentioned before, the authors of zkPAKE have presented two variants of it. The original proposal from [23] differs from the follow-up version in two places: the nonce N is left underspecified, and the value t on the client side is computed without involving the received nonce. This difference also affects the computation of \(t'\) on the server side. In more detail, the original zkPAKE protocol runs as follows:

  1. The server S sends his nonce N to the client C.

  2. The client calculates the hash r of his password, chooses a random parameter \({v} \leftarrow \mathbb {Z}_q\), and computes \(t := g^v\). Then, C computes \(c := H_1(g, g^r, t, N)\) and obtains \(u:= v-H_1(c)r\) in \(\mathbb {Z}_q\). Next, C computes the session key \(sk_c := H_2(c)\) and sends u and \(H_1(c)\) to S.

  3. Upon receiving \(H_1(c)\) and u, S recovers \(t'\) by computing \(g^{u}R^{H_1(c)}\). Then, S calculates \(c' := H_1(g, R, t', N)\). Next, S checks if \(H_1(c')\) matches \(H_1(c)\). If it does, S computes the session key \(sk_s := H_2(c')\) and sends \(H_1(sk_s)\) to C. Otherwise, he aborts the protocol.

  4. Finally, upon receiving \(H_1(sk_s)\), C checks if \(H_1(sk_s)\) matches \(H_1(sk_c)\). If the values are equal, C saves the computed session key \(sk_c\) and terminates.
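Why the server's recomputation succeeds for an honest client can be checked directly. The derivation below assumes \(R = g^r\) and, for the variant in Fig. 1, \(N = g^n\); both are implied by the server-side formulas but are stated here as assumptions. Exponents are taken modulo q.

\[
\text{original [23]:}\quad t' = g^{u} R^{H_1(c)} = g^{\,v - H_1(c) r}\, g^{\,r H_1(c)} = g^{v} = t
\]

\[
\text{variant [24]:}\quad t' = g^{un} R^{nH_1(c)} = \left( g^{n} \right)^{u + H_1(c) r} = N^{v} = t
\]

In both cases, correctness rests on the relation \(v = u + H_1(c) r\), which ties the transmitted value u to the password hash r.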

3 Offline Dictionary Attack on zkPAKE

In this section, we show that both variants of the zkPAKE protocol are vulnerable to an offline dictionary attack. Our attack exploits the fact that r, which is a hash of the client's password, is of low entropy.

3.1 Attack Description

Let the enrollment phase be completed and let an attacker \(\mathcal {A}\) be allowed only to eavesdrop on the communication between two honest parties. The attack on the version of the zkPAKE protocol presented in Fig. 1 proceeds as follows:

Step 1. The execution of the protocol starts and S sends his first message, N. The attacker \(\mathcal {A}\) sees the message and stores it in his memory.

Step 2. C does all the computations demanded by the protocol and sends u and \(H_1(c)\) in the second transmission to S. \(\mathcal {A}\) observes the second message and obtains u and \(H_1(c)\).

Step 3. The adversary, who now holds N, u and \(H_1(c)\) from the first two message rounds, may go offline to perform a dictionary attack. His goal is to compute a candidate \(c'_i\) and then use the stored \(H_1(c)\) as a verifier. The adversary computes \(c'_i := H_1(g, g^{r_i}, t'_i, N)\). The two intermediate inputs to the hash function are obtained by first choosing a candidate password \(\pi _i\), and then computing the corresponding \(r_i\) and \(t'_i\). Note that the adversary can easily compute \(t'_i = N^{v_i}\), since \(v_i := u + H_1(c) r_i\). Finally, the adversary checks if his guess \(H_1(c'_i)\) matches \(H_1(c)\).

Step 4. The adversary repeats Step 3 until he guesses the correct password.

For the original zkPAKE protocol, the attack works in a very similar way: Steps 1, 2 and 4 are the same, while Step 3 needs a minor change:

Step 3a. The adversary, who now holds N, u and \(H_1(c)\) from the first two message rounds, may go offline to perform a dictionary attack. As above, the adversary aims to obtain a candidate \(c'_i\) by computing the hash \(H_1(g, g^{r_i}, t'_i, N)\). The only difference is that here \(t'_i = g^{v_i}\), while the formula for computing \(v_i\) stays the same.

Note that one can mount a similar dictionary attack by impersonating a server. In this case, the only difference from the eavesdropping attack described above is that the attacker picks the value of the nonce N. Such knowledge, however, does not additionally help the adversary in our attack. Once the adversary receives the client's reply, he can continue with Steps 3 and 4 from the eavesdropping attack.
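The passive attack can be sketched end to end as follows. This is not the prototype from the paper but a minimal illustration under stated assumptions: a toy group (p = 2039, q = 1019, g = 4), SHA-256 with domain-separation tags in place of \(H_1\) and the password hash, and \(N = g^n\) in both variants (the original proposal leaves N underspecified). All function names are hypothetical. Because q is tiny here, two different passwords may map to the same r; the attack then returns an equivalent password.

```python
import hashlib
import secrets

P, Q, G = 2039, 1019, 4  # toy group parameters (assumption, insecure)


def _h(tag, *parts):
    data = tag + b"|" + b"|".join(str(x).encode() for x in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def H1(*parts):
    return _h(b"H1", *parts)


def hash_pw(password):
    return _h(b"pw", password) % Q


def eavesdrop(password, original=False):
    """Run the first two rounds honestly and return what A sees: (N, u, H1(c))."""
    n = secrets.randbelow(Q - 1) + 1
    N = pow(G, n, P)
    r = hash_pw(password)
    v = secrets.randbelow(Q)
    # Original zkPAKE uses t = g^v; the later variant uses t = N^v.
    t = pow(G if original else N, v, P)
    c = H1(G, pow(G, r, P), t, N)
    h = H1(c)
    return N, (v - h * r) % Q, h


def offline_attack(transcript, dictionary, original=False):
    """Steps 3/3a and 4: test candidates offline against the verifier H1(c)."""
    N, u, h = transcript
    base = G if original else N
    for candidate in dictionary:
        r_i = hash_pw(candidate)
        v_i = (u + h * r_i) % Q           # v_i = u + H1(c) * r_i
        t_i = pow(base, v_i, P)           # t'_i = N^{v_i} or g^{v_i}
        c_i = H1(G, pow(G, r_i, P), t_i, N)
        if H1(c_i) == h:                  # the guess reproduces the verifier
            return candidate
    return None


words = ["123456", "letmein", "correct horse", "qwerty"]
print(offline_attack(eavesdrop("correct horse"), words))
```

No interaction with the server or the client is needed after the transcript has been recorded, so guess-limiting on the server does not slow this search down.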

3.2 Attack Implementation

We implemented a prototype in Python 3 to simulate the attack described above. Our simulation consists of two steps: in the first step, a password is randomly chosen from one of three fixed dictionaries that vary in size, and the zkPAKE protocol is executed between two honest parties. Then, in the second step (see Algorithm 1), the adversary is given access to the honestly generated values as described in Sect. 3.1. With this information in hand, the adversary can easily perform an offline dictionary attack against the chosen password.

Algorithm 1. Offline dictionary attack on zkPAKE.

We performed a set of experiments, using a 224-bit subgroup of a 2048-bit finite field Diffie-Hellman group, to determine the time it takes to complete an offline dictionary attack depending on the size of the selected dictionary. Each set of experiments involved mounting the attack by enumerating dictionaries that contain 1000, 10000, and 100000 random passwords. Each experiment was performed 50 times.

Results. The times it took the adversarial algorithm described above to find a matching password for each given dictionary are summarized in Table 1.

Table 1. Results for different dictionary sizes

Our results demonstrate that there is a linear relationship between the size of the dictionary and the average time to find a matching password, and show that the attack is feasible for any adversary with even a small amount of computational power. As expected, the total time for cracking a dictionary of 100000 passwords is less than 5 min, and thus we conclude that the attack would be feasible for dictionaries with significantly more elements. We also note that there are more powerful tools for creating more efficient dictionaries, such as HashCat [27] or John the Ripper [1], which would make the offline search more effective.
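To make the scaling argument concrete, the following back-of-the-envelope estimate takes the 5 min bound for 100000 candidates as given and assumes that the linear trend continues on the same hardware. The dictionary size \(10^7\) is a hypothetical example, not a measured data point.

\[
\frac{300\ \text{s}}{10^{5}\ \text{guesses}} = 3\ \text{ms per guess}, \qquad 10^{7}\ \text{guesses} \times 3\ \text{ms} = 3 \times 10^{4}\ \text{s} \approx 8.3\ \text{h}
\]

Since 5 min is an upper bound on the measured time, both figures are upper bounds as well.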

4 Conclusion

In this paper, we showed that both versions of the zkPAKE protocol [23, 24] are vulnerable to offline dictionary attacks. To make matters worse, in the case of zkPAKE the adversary only needs eavesdropping capabilities to mount the attack.

Taking a wider view of zkPAKE, the problem with its design lies in the fact that the variable r, which is of low entropy, is used as a mask for the secret value v. In contrast, in a typical zero-knowledge proof of knowledge, which served as an inspiration for the zkPAKE design, such a value is of high entropy. By showing this vulnerability, we hope that in the future protocol designers will be more careful in claiming the security of proposed protocols, especially when those claims are not backed by a proof of security. Since the core design of the zkPAKE protocol is flawed beyond repair and many mature PAKE alternatives already exist, we do not pursue further study to improve on the zkPAKE protocol.