Verifiable Cloud Computing
Verifiable cloud computing is a way to provide cloud computing services that outsource computing to untrusted third parties while maintaining the integrity of the computation results.
In cloud computing, the data owner outsources data storage and query services to a cloud service provider in order to scale the services at low cost. However, this outsourcing model raises serious concerns about computation integrity. Because the service provider is not the real owner of the data, it might return incomplete or incorrect results, intentionally or unintentionally. Thus, to ensure computation integrity, the client needs to authenticate the soundness (every result originates from the data owner’s database), the completeness (no valid result is missing), and the freshness (the result is up to date) of the computation results.
One early solution to this problem is verification by replication (Haeberlen et al., 2007). The client outsources the same computing task to multiple workers run by different service providers. If too few results are returned within a reasonable time, or the results do not agree, the client sends the task to additional workers. Once a minimum number of workers agree on the same computation results, the client assumes those results are correct. Although this solution is simple and effective, it assumes failure independence, which does not hold when service providers are malicious.
To provide strong guarantees on the integrity of computation results against untrusted or even malicious service providers, cryptographically enforced verifiable cloud computing has been proposed and studied in a large body of literature. The basic idea is that the service provider returns not only the computation results but also a cryptographic proof, which the client can use to establish the soundness, completeness, and freshness of those results.
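The replication strategy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: workers are modeled as plain callables, a missed deadline is modeled as a `TimeoutError`, and the names `verify_by_replication`, `quorum`, and `batch_size` are hypothetical rather than taken from any cited system.

```python
from collections import Counter
from itertools import islice


def verify_by_replication(task, workers, quorum, batch_size=2):
    """Send `task` to workers in batches until `quorum` of them agree.

    Returns the agreed result, or None if the workers run out first.
    Results must be hashable so that identical answers can be counted.
    """
    votes = Counter()
    remaining = iter(workers)
    while True:
        batch = list(islice(remaining, batch_size))
        if not batch:
            return None  # no more workers to ask, no quorum reached
        for worker in batch:
            try:
                votes[worker(task)] += 1
            except TimeoutError:
                pass  # a worker that misses the deadline casts no vote
        if votes:
            result, count = votes.most_common(1)[0]
            if count >= quorum:
                return result


# Hypothetical workers used only for illustration.
def honest(task):
    return sum(task)


def faulty(task):
    return -1


def slow(task):
    raise TimeoutError("no answer within the deadline")
```

With workers `[honest, faulty, slow, honest, honest]` and a quorum of 3, the sketch keeps asking until three workers agree on the sum. With `[faulty, faulty, honest]` and a quorum of 2, it accepts the wrong answer, which illustrates why the scheme depends on failure independence.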
Several metrics are used to evaluate verifiable cloud computing protocols: (i) preprocessing time, which is the time for the data owner to generate auxiliary data such as an authenticated data structure; (ii) proving time, which is the time for the service provider to compute the proof; (iii) verification time, which is the time for the client to verify the proof; and (iv) proof size. Ideally, the verification time and proof size should be proportional to the size of the computation results and independent of the size of the whole database. This is particularly important when the client is a mobile user connecting to the cloud through a wireless network.
In general, there are two fundamental approaches to supporting verifiable cloud computing, each with its own advantages and disadvantages. On the one hand, one can design a verification scheme specifically for the computation task. This is often achieved by letting the data owner sign a well-designed authenticated data structure (ADS), based on which the service provider can construct corresponding proofs for the outsourced computation. This approach yields low overhead but supports only a limited set of computation tasks. On the other hand, a verification scheme may model the computation task as a general Turing machine. Such schemes can support arbitrary tasks at the expense of high and sometimes impractical overhead.
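As an illustration of the first, task-specific approach, the sketch below uses a Merkle hash tree, one widely used ADS, to prove that a single record belongs to the owner's dataset. It is a minimal sketch under assumptions: the function names are hypothetical, SHA-256 is an arbitrary choice of hash, and the owner's signature on the root is not modeled, so the client is simply assumed to hold an authentic copy of the root.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(records):
    """Data owner: hash the records into a tree; levels[-1][0] is the root.

    Assumes a non-empty list of byte strings. Leaf and inner hashes use
    different prefixes so a leaf cannot be confused with an inner node.
    """
    levels = [[h(b"\x00" + r) for r in records]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        # An odd-sized level pairs its last node with itself.
        padded = cur + [cur[-1]] if len(cur) % 2 else cur
        levels.append([h(b"\x01" + padded[i] + padded[i + 1])
                       for i in range(0, len(padded), 2)])
    return levels


def prove(levels, index):
    """Service provider: collect the sibling hashes on the leaf-to-root path."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # unpaired last node is its own sibling
        proof.append((level[sibling], index % 2 == 1))  # (hash, sibling_is_left)
        index //= 2
    return proof


def verify(record, proof, root):
    """Client: recompute the root from the record and the proof."""
    node = h(b"\x00" + record)
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = h(b"\x01" + pair)
    return node == root
```

The proof contains one hash per tree level, so its size and the verification time grow logarithmically with the number of records rather than linearly. This sketch only covers the soundness of one record; proving completeness for range queries typically requires additional information, such as the records bordering the queried range.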
In comparison with task-specific schemes, a general-purpose verifiable cloud computing scheme does not assume any specific properties of the computation task. Instead, the computation task is represented as a Boolean or arithmetic circuit, which can express any computation of bounded size and thus can be used for arbitrary cloud computation tasks. As the Boolean or arithmetic circuit can be viewed as a series of constraints on the internal computation state as well as the final results, it can be transformed into a so-called quadratic span program. By utilizing certain cryptographic primitives such as pairings, the equivalent program can be verified using a technique known as zk-SNARKs (Parno et al., 2013). In addition to being able to authenticate arbitrary programs, zk-SNARKs achieve extremely low verification overhead: the verification time and proof size are both constant. Further, the proof leaks no information beyond the computation result, which is crucial for applications with privacy and confidentiality concerns.
However, as a cost of their generality, zk-SNARKs incur high overhead in preprocessing time and proving time, and they are considered impractical for many real-world cloud computing problems. Nevertheless, many studies have been carried out to reduce this overhead. Ben-Sasson et al. (2014) propose a zk-SNARK variant that avoids hard-coding the computation program into its verification key and thus reduces the preprocessing cost. Zhang et al. (2017a) propose an interactive protocol for general-purpose SQL queries, whose cost is substantially lower than that of the original zk-SNARK scheme. More recently, it has been proposed to model the computation task as a random access machine (RAM) program as opposed to a circuit (Braun et al., 2013; Ben-Sasson et al., 2013; Zhang et al., 2018). This can yield significant performance improvements for some computation tasks.
For example, the size of a circuit implementing binary search on a sorted array is linear in the length of the array, whereas the complexity of a RAM program for binary search is only logarithmic.
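The gap can be made concrete by counting array reads. The sketch below is only an illustration: `oblivious_search` is a hypothetical stand-in for a circuit, under the assumption that a circuit's wiring is fixed in advance and therefore must touch every array element regardless of the input, while `ram_binary_search` reads only the elements that the search actually visits. Neither function performs any cryptographic verification.

```python
def ram_binary_search(arr, target):
    """RAM-style search: the next address read depends on earlier reads."""
    reads = 0
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        reads += 1
        if arr[mid] == target:
            return mid, reads
        if arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, reads


def oblivious_search(arr, target):
    """Circuit-style search: a fixed access pattern over every element."""
    found = -1
    reads = 0
    for i, value in enumerate(arr):
        reads += 1
        found = i if value == target else found
    return found, reads


sorted_array = list(range(0, 2048, 2))  # 1024 sorted even numbers
ram_result = ram_binary_search(sorted_array, 1000)
circuit_result = oblivious_search(sorted_array, 1000)
```

On this 1024-element array the RAM-style search needs at most 11 reads, while the fixed-pattern scan always performs 1024. A scheme whose proving cost tracks the number of executed steps therefore benefits directly from the RAM model.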
Verifiable cloud computing is essential in every cloud computing environment that has security and trust concerns. For example, it is imperative in scenarios where business intelligence executives make critical, million-dollar decisions, such as investing in new businesses, based on OLAP queries in the cloud. Similar requirements also exist in many other applications such as scientific research and government policy making.
- Ben-Sasson E, Chiesa A, Tromer E, Virza M (2014) Succinct non-interactive zero knowledge for a von Neumann architecture. In: Proceedings of the 23rd USENIX conference on security symposium, pp 781–796
- Braun B, Feldman AJ, Ren Z, Setty S, Blumberg AJ, Walfish M (2013) Verifying computations with state. In: Proceedings of the twenty-fourth ACM symposium on operating systems principles, pp 341–357
- Chen Q, Hu H, Xu J (2015) Authenticated online data integration services. In: Proceedings of the 2015 ACM SIGMOD international conference on management of data, pp 167–181
- Hu H, Xu J, Chen Q, Yang Z (2012) Authenticating location-based services without compromising location privacy. In: Proceedings of the 2012 ACM SIGMOD international conference on management of data, pp 301–312
- Hu H, Chen Q, Xu J (2013) VERDICT: privacy-preserving authentication of range queries in location-based services. In: 2013 IEEE 29th international conference on data engineering, pp 1312–1315
- Li F, Hadjieleftheriou M, Kollios G, Reyzin L (2006) Dynamic authenticated index structures for outsourced databases. In: Proceedings of the 2006 ACM SIGMOD international conference on management of data, pp 121–132
- Merkle RC (1989) A certified digital signature. In: Advances in cryptology – CRYPTO, pp 218–238
- Pang H, Mouratidis K (2008) Authenticating the query results of text search engines. In: Proceedings of the VLDB endowment, pp 126–137
- Pang H, Tan KL (2004) Authenticating query results in edge computing. In: Proceedings of the 20th international conference on data engineering, pp 560–571
- Papadopoulos D, Papadopoulos S, Triandopoulos N (2014) Taking authenticated range queries to arbitrary dimensions. In: Proceedings of the 2014 ACM SIGSAC conference on computer and communications security, pp 819–830
- Parno B, Howell J, Gentry C, Raykova M (2013) Pinocchio: nearly practical verifiable computation. In: 2013 IEEE symposium on security and privacy (SP), pp 238–252
- Xu C, Xu J, Hu H, Au MH (2018b) When query authentication meets fine-grained access control: a zero-knowledge approach. In: Proceedings of the 2018 ACM SIGMOD international conference on management of data, pp 147–162
- Yang Y, Papadias D, Papadopoulos S, Kalnis P (2009a) Authenticated join processing in outsourced databases. In: Proceedings of the 2009 ACM SIGMOD international conference on management of data, pp 5–18
- Yiu ML, Lo E, Yung D (2011) Authentication of moving kNN queries. In: Proceedings of the 27th IEEE international conference on data engineering, pp 565–576
- Zhang Y, Katz J, Papamanthou C (2015) IntegriDB: verifiable SQL for outsourced databases. In: Proceedings of the 22nd ACM SIGSAC conference on computer and communications security, pp 1480–1491
- Zhang Y, Genkin D, Katz J, Papadopoulos D, Papamanthou C (2017a) vSQL: Verifying arbitrary SQL queries over dynamic outsourced databases. In: IEEE symposium on security and privacy, pp 863–880
- Zhang Y, Katz J, Papamanthou C (2017b) An expressive (zero-knowledge) set accumulator. In: IEEE European symposium on security and privacy (EuroS&P), pp 158–173
- Zhang Y, Genkin D, Katz J, Papadopoulos D, Papamanthou C (2018) vRAM: faster verifiable RAM with program-independent preprocessing. In: IEEE symposium on security and privacy