ESVSSE: Enabling Efficient, Secure, Verifiable Searchable Symmetric Encryption
Zhenkui Shi; Xuemei Fu; Xianxian Li; Kai Zhu
Open Access
Under a Creative Commons License

Abstract:
Symmetric Searchable Encryption (SSE) is deemed to tackle the privacy issue as well as the operability and confidentiality in data outsourcing. However, most SSE schemes assume that the cloud is honest but curious, an assumption that is not always applicable. Even for the schemes that support verification, integrity, or freshness checking against a malicious cloud, the performance and security functionalities are not fully exploited. In this paper, we propose an efficient SSE scheme based on a B+-Tree and a Counting Bloom Filter (CBF) that supports secure verification, dynamic updating, and multi-user queries. Compared with the previous state of the art, we design a new CBF data structure to support dynamic updating and speed up verification. We also leverage a timestamp mechanism to prevent a malicious cloud from launching replay attacks. The newly designed CBF acts as a front engine that saves the user's cost for query and verification: when no value matches the queried keyword, it achieves more efficient query and verification with negligible false positives. The CBF supports efficient dynamic updating by combining a Bloom Filter with a one-dimensional array that provides counting capability. Furthermore, we design an authenticator for the CBF. We adopt the B+-Tree because it is widely used in many database engines and file systems. We also give a brief security proof of our scheme, provide a detailed performance analysis, and evaluate the scheme through comprehensive experiments. The results are consistent with our analysis and show that our scheme is secure and more efficient than previous schemes with the same functionalities. The average performance improves by about 20 percent for both the cloud servers and the users when the missing rate of the searched keywords is 20 percent, and the higher the missing rate, the greater the improvement.
Published in: IEEE Transactions on Knowledge and Data Engineering (Volume: 34, Issue: 7, 01 July 2022)
Page(s): 3241 - 3254
Date of Publication: 21 September 2020
DOI: 10.1109/TKDE.2020.3025348
Publisher: IEEE

CC BY - IEEE is not the copyright holder of this material. Please follow the instructions via https://creativecommons.org/licenses/by/4.0/ to obtain full-text articles and stipulations in the API documentation.
SECTION 1

Introduction

Cloud computing has been widely deployed and has become indispensable for many companies and institutions, due to salient features such as on-demand services, elasticity, flexibility, multi-tenancy, efficient data access, and reduced maintenance cost [1]. Cloud-based services and applications are gradually being enriched, serving many governments, educational institutions, medical institutions, and large enterprise groups [2]. As one of the key services, data outsourcing [3], [4], [5], [6], [7], [8], [9], [10] solves the problem of storing large amounts of data at low cost for users with limited storage resources and has been widely adopted.

If data is stored in the cloud directly, users run the risk of revealing private information. This is critical especially for high-impact business data or medical records, since most service providers do not guarantee that they will not view or modify clients' data. A cloud server may also act selfishly to save computation or transmission overhead [14], [16]. So an elegant and practical outsourcing storage scheme must provide security and privacy guarantees such as confidentiality, integrity, freshness, authenticity, and verification, in addition to user experience and conventional operability and management such as updating, retrieving, and supporting multiple users.

In order to ensure data confidentiality without losing operability and manageability of the data, many SSE schemes [11], [17], [18], [19], [20], [21] were proposed at an early stage. However, most of these SSE schemes assume that the cloud server is honest but curious [19], [20]. This assumption is not always valid, because the cloud server may suffer from external attacks, internal configuration errors, software vulnerabilities, and insider threats [11], [18]. The data may be tampered with, or the server may return cached historical or partial results.

To prevent a malicious server from returning partial or tampered results, integrity checking should be supported. However, most schemes ignore this or related issues. For example, some existing SSE schemes do not guarantee the integrity of the search results returned by the server to the user [17], [18], [19], [20]. Although some schemes can ensure the integrity of data [6], [22], and verifiable SSE schemes have been extensively studied [11], [13], [14], [23], [24], [25], [26], these schemes unfortunately only support verification over static databases [11], [26]. The scheme in [21] cannot prevent a malicious server from returning an empty result. The scheme in [26], based on cuckoo hashing, solves the problem of the server returning null results, but it does not support dynamic updates for data verification.

In addition, most currently proposed SSE schemes are two-party models between data owners and servers. [15] proposed a verifiable three-party SSE scheme based on a Merkle Tree index, where the three parties are the data owner, the users, and the servers. However, one drawback of [15] is that the server needs to traverse the whole index when the queried keyword is not present. In this situation, the server spends time traversing the whole index and creating the results for verification before sending them to the user, and the user then spends time verifying the results, only to obtain an empty result. The user experience is greatly degraded in this scenario. Many cases can lead to it, such as querying a keyword that does not exist in the index, spelling errors, hitting an adjacent key when typing, etc. This drawback exists in most current SSE schemes. We give a detailed analysis in a later section.

Table 1 compares various existing verifiable SSE schemes with our proposed scheme. To the best of our knowledge, none of the existing verifiable SSE schemes can explicitly allow users to verify the absence of keywords in the three-party model efficiently. In summary, the performance and security functionalities are not fully exploited.

TABLE 1. Comparison With Existing Typical Verifiable SSE Schemes

Challenges. Although there have been many SSE schemes, some challenges still need to be fully explored.

  • First, when a user makes a search query, how does the user verify the integrity of the results returned by the server? In our scheme, the data owner uses a B+-Tree to build the index and hashes all nodes with several hash functions to generate a root. The user verifies the integrity of the search results through this root. We leverage two methods to generate the root of the B+-Tree: one hashes all the nodes, and the other hashes only the leaf nodes. Since the leaf nodes contain all key-value pairs, the integrity of the search results can be verified either way. Although there are many applications based on B+-Trees, they cannot be utilized directly in this context, which requires strong privacy and security guarantees such as integrity checking and verification.

  • Second, how does a scheme prevent malicious servers from launching replay attacks while supporting the three-party model? [13] proposes a timestamp-based scheme to solve this problem, but it can only be applied to the two-party model. We address this issue by combining the timestamp mechanism with the root of the B+-Tree and authorized users.

  • Finally, costs and user experience are common issues for most current schemes. In traditional verifiable SSE schemes, whether or not a keyword queried by the user exists, the server always needs to traverse the whole secure index and create the corresponding authenticator. This is inefficient for both the server and the users. A new idea should be adopted to differentiate the two situations and boost performance and user experience. The difficulty is how to make the newly designed scheme compatible with the previous scheme while still supporting dynamic updating. Besides, it is essential to save users' costs, for some cloud service providers charge not only by the size of the storage requested by the client but also by the amount of computation [27], [28], [29]. How can the server avoid unnecessary searches? It is therefore necessary to improve the search efficiency of SSE. At present, most SSE schemes do not take such charges into account: the server's query cost is the same whether the queried tokens are present or absent.

Here, we design a specific CBF that supports efficient dynamic updating and verification to solve the final challenge. Our methodology is straightforward but effective. When the keyword queried by the user does not exist, the server only needs to return a Bloom Filter authenticator to the user, avoiding traversing the whole index and creating the corresponding authenticator. The performance can be greatly improved and is independent of the scale of the dataset when the searched keyword is missing.

In Conclusion. Our contributions can be summarized as follows:

  • We propose an Efficient, Secure, and Verifiable SSE scheme under the three-party model. It guarantees both the integrity and the freshness of the search results returned by the server to the user.

  • Our secure index is based on a B+-Tree and a specific CBF, which supports secure verification, dynamic updating, and multi-user queries. To the best of our knowledge, it is the first B+-Tree-based SSE scheme that enables efficient verification, integrity checking, and dynamic updating.

  • The newly designed CBF with counting capability supports efficient updating and verification. It can greatly reduce both the server's search overhead and the users' verification cost, especially when the keyword queried by the user does not exist.

  • We verify the feasibility and security of the scheme through experiments. The results show that the CBF achieves stable, more efficient query and verification when the queried keyword does not exist.

Organization. The rest of the paper is organized as follows: Section 2 presents the related work. We introduce the threat model and our design goal in Section 3 and give a system overview of the proposed scheme in Section 4. The detailed construction is presented in Section 5. Then, we provide the security analysis in Section 6 and performance analysis and evaluation in Section 7 respectively. Finally, we present our conclusion in Section 8.

SECTION 2

Related Work

Encrypted Storage Outsourcing. [30], [31] allow clients with limited resources to outsource large amounts of data to cloud service companies at low cost. Since the data are directly encrypted and stored on the cloud server, users cannot query the encrypted data directly. If a user wants to update the database, the user is required to download the data locally, update the database, and upload it back to the server.

Searchable Encryption. In order to query the encrypted data on the cloud server, several relevant technologies were proposed [32], [33], [34], [35]. SSE is one of the most important of these. SSE is a search scheme based on a keyword index that can perform operations such as querying over cloud-encrypted data. In [5], the authors provided a privacy-protection framework for outsourced media search; the work relies on multimedia hashing and symmetric encryption and tries to balance the strength of privacy enforcement, the quality of search, and the computational complexity. We summarize the related work as follows.

Verifiable Searchable Symmetric Encryption. Some of the most recent work focuses on forward or backward security. Most of it assumes an honest-but-curious server that follows the defined protocol. Although these schemes can provide forward or backward security, the search results cannot be verified efficiently when the server performs active attacks. On this basis, schemes with verification functionality were proposed [12], [15], [36]. [21] proposed dynamic authenticators to check the integrity of the results, but it cannot prevent replay attacks by malicious servers. For the verifiable schemes proposed in [14], [24], [37], either the query efficiency is unsatisfactory or verification under updates is not supported. The universal verifiable scheme proposed by [37] transforms a symmetric searchable encryption scheme into a no-dictionary verifiable scheme, so users do not have to maintain a collection of all keywords, but it is a static approach. [13] addressed verification by using message authentication codes, but it cannot detect that the server intentionally returned an empty result. [15] proposes a secure verification scheme that can prevent replay attacks from cloud servers. However, it does not address the case where the keyword queried by the user does not occur.

Multi-User Searchable Encryption. Some work has tried to enhance the functionality and performance of data outsourcing and cloud services. The original SSE schemes work in a two-party model; as a natural extension, multi-user searchable encryption schemes have been continually proposed. The schemes in [15], [20], [36] are typical multi-user searchable encryption schemes. [20] was the first to propose a multi-user encryption scheme based on broadcast encryption. [15] leveraged a Merkle Patricia Tree and incremental hashing to build a proof index supporting data updates. [36] utilizes a two-keyword index to reduce the search time. However, [36] has a shortcoming: its two-keyword index comprises all keywords together with all pairwise combinations of them, which significantly increases the server's storage and search overhead. When the queried keyword does not exist, [15], [36] incur the same cost as when it exists, which reduces search efficiency for absent keywords. [38] proposed a non-interactive multi-user searchable encryption scheme that reduces the interactions between the data owner and users, but it does not support data updates. [3] tries to utilize caches, encryption, and compression to improve performance and reduce the size of data transfers. [39] studies the problem of secure skyline queries over encrypted data. These solutions are still not full-fledged in aspects such as performance and verification.

SECTION 3

Problem Statement

In this section, we will describe the attack model of SSE. Then, we formally present our design goals.

3.1 Threat Model

We assume that the data owner is credible and that the users authorized by the data owner are also trusted. The cloud servers store the encrypted index and documents and perform SSE. We consider the cloud servers untrusted and malicious: they may perform active attacks. The server might deduce sensitive information from the search results. Besides, when a client launches a query, the servers may return partial search results to the user, or they may initiate replay attacks. We try to mitigate such attacks in our scheme.

Definition 1 Replay Attack.

In a replay attack on SSE, a malicious server (or attacker) returns search results to a user from a historical version of the dataset.

Definition 2 Data Integrity Attack.

In a data integrity attack on SSE, a malicious server (or attacker) deliberately returns partial or empty search results to the user.

In addition, we do not consider access pattern and query pattern leakage, which can be tackled by oblivious methodologies, nor do we consider information leakage through side channels or timing channels.

3.2 Design Goal

In this paper, our goal is to design an Efficient, Secure, Verifiable SSE scheme on the three-party model. We try to achieve the following goals:

  1. The user can verify the correctness of the received result from the server.

  2. The user can detect whether the server is launching a replay attack.

  3. The overall performance can be reasonably improved. For example, when the server receives the user's query, it can save overall cost through our specific algorithm and data structure.

  4. The proposed data structure and algorithm can support efficient dynamic updating.

SECTION 4

Overview Of ESVSSE

In this section, we will briefly describe our scheme. The main symbols used in this paper are given in Table 2.

TABLE 2. Notations

4.1 System Structure

Fig. 1 shows the overall structure of our scheme. It consists of data owners, cloud servers, and authorized users. The data owner stores the encrypted data in a cloud server. Authorized users can search the data of the cloud server. Data owners have administrative rights over data in the cloud.

Fig. 1. The system architecture of ESVSSE on the three-party model.

4.2 System Model

We aim to develop an Efficient, Secure, Verifiable Symmetric Searchable Encryption scheme (ESVSSE). The data owner uploads the encrypted documents, the authenticator, and the B+-Tree-based secure index to the cloud server. The scheme allows the user to verify the integrity and freshness of the search results. ESVSSE is a three-party model in which the data owner stores secure indexes, authenticators, and encrypted documents on the cloud server. The data owner can authorize users to query the cloud servers; the authorization procedure is similar to [20]. The cloud server provides storage and search capabilities, and authorized users can initiate query and verification operations.

Definition 3. ESVSSE is a collection of seven polynomial-time algorithms.

  • Gen(1^k) → {K1, K2, K3, (ssk, spk)}: a probabilistic algorithm executed by the data owner. It takes as input a security parameter k and outputs the private keys K1, K2, K3 and a random signing key pair (ssk, spk).

  • Init(K1, K2, K3, ssk, D) → {I, π, πbf, C}: an algorithm executed by the data owner. It takes as input the secret keys K1, K2, K3, the signing private key ssk, and the document set D, and outputs the secure index I, the encrypted documents C, the authenticator π, and the authenticator πbf of the Counting Bloom Filter. The πbf authenticator provides proof when the keyword queried by the user does not exist. The data owner stores I and πbf locally and meanwhile sends I, C, π, and πbf to the cloud server.

  • PreUpdate(K1, K2, K3, ssk, f) → {τω, π, πbf}: an algorithm that takes as input the private keys K1, K2, K3, the signing private key ssk, and the file f to be updated, and outputs the update tokens τω and the authenticators π and πbf. The data owner runs this algorithm and sends τω, πbf, and π to the cloud server.

  • Update(I, τω) → {I}: an algorithm run by the cloud server that takes as input the secure index I and the update tokens τω, and outputs the new secure index I.

  • Trapdoor(K1, ω) → {τω}: a deterministic algorithm run by the authorized user to generate a trapdoor for a given keyword. It takes as input a secret key K1 and a keyword ω, and outputs the trapdoor τω corresponding to ω (a minimal sketch follows this definition).

  • Search(I, τω, tq) → {(ρ, πtq, πc, Rω) or (πbf)}: a deterministic algorithm run by the server. It takes as input the secure index I, the trapdoor τω, and the query time tq, and outputs the path list ρ of the search results, the authenticators πtq and πc, and the encrypted documents Rω of the search results. The server sends ρ, πtq, πc, and Rω to the user, or outputs the proof authenticator πbf.

  • Validation(K1, K2, K3, spk, Rω, ρ, πtq, πc, τω, πbf) → {0, 1}: an algorithm run by the authorized user. It takes as input the symmetric keys K1, K2, K3, the public key spk, the query result Rω, the search path list ρ, the authenticators πtq, πc, πbf, and the token τω corresponding to ω, and outputs a bit b. The user accepts the query result when b equals 1 and rejects it when b equals 0.

\begin{equation}
\left\{
\begin{aligned}
&\pi_{i,0} = (\nu_{i,0}, \mathrm{Sig}(\nu_{i,0})), \quad \nu_{i,0} = \mathrm{Enc}_{K_3}(rt_{i,0} \,\|\, tp_{i,0}), && up_i < tp_{i,0} \le up_{i+1} \\
&\pi_{i,j} = (\nu_{i,j}, \mathrm{Sig}(\nu_{i,j})), \quad \nu_{i,j} = \mathrm{Enc}_{K_3}(rt_{i,j} \,\|\, tp_{i,j} \,\|\, \nu_{i,j-1}), && tp_{i,j-1} < tp_{i,j} \le up_{i+1} \\
&\pi_{i,n} = (\nu_{i,n}, \mathrm{Sig}(\nu_{i,n})), \quad \nu_{i,n} = \mathrm{Enc}_{K_3}(rt_{i,n} \,\|\, tp_{i,n} \,\|\, \nu_{i,n-1}), && tp_{i,n} = up_{i+1}
\end{aligned}
\right.
\tag{1}
\end{equation}

\begin{equation}
\left\{
\begin{aligned}
&\pi^{bf}_{n} = (\psi^{bf}_{n}, \mathrm{Sig}(\psi^{bf}_{n})) \\
&\psi^{bf}_{n} = \mathrm{Enc}_{K_3}(\mathit{CBF} \,\|\, tp_{i,n} \,\|\, \psi^{bf}_{n-1})
\end{aligned}
\right.
\tag{2}
\end{equation}
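To make the Trapdoor algorithm concrete, the following is a minimal Java sketch of deterministic token generation. It assumes, as in our implementation described in Section 7.2, that the pseudo-random function is instantiated with HMAC-SHA256; the class and method names are illustrative, not the prototype's actual API.

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Minimal sketch: derive a deterministic trapdoor tau_w = F_{K1}(w),
// with the PRF instantiated as HMAC-SHA256 (see Section 7.2).
public class TrapdoorSketch {
    public static byte[] trapdoor(byte[] k1, String keyword) throws Exception {
        Mac hmac = Mac.getInstance("HmacSHA256");
        hmac.init(new SecretKeySpec(k1, "HmacSHA256"));
        return hmac.doFinal(keyword.getBytes("UTF-8"));
    }

    public static void main(String[] args) throws Exception {
        byte[] k1 = new byte[32]; // placeholder for the secret key K1
        byte[] tau = trapdoor(k1, "keyword");
        // The same (K1, keyword) pair always yields the same token,
        // which the server matches against the secure index.
        System.out.printf("token length: %d bytes%n", tau.length);
    }
}

Because the algorithm is deterministic, any authorized user holding K1 derives the same τω for the same keyword, which is what allows the server to match the trapdoor against the secure index.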

SECTION 5

The Construction of ESVSSE

In this section, we start with the algorithms for building the secure index, then describe the detailed algorithms for the query procedure and for verifying the search results. Finally, we illustrate ESVSSE through an example.

5.1 Building Secure Index

We use the B+-Tree structure to build a secure index. The index is built from the collection of documents D and the inverted list Δ computed from the document set. The inverted list includes four parts: all keywords, the documents corresponding to each keyword, the trapdoor corresponding to each keyword, and the hash values of the documents corresponding to each keyword. We build a B+-Tree from key-value pairs, where the values are stored only in the leaf nodes. To prevent the cloud server from launching replay attacks and to ensure the freshness of the root, we maintain a timestamp chain of authenticators, so that users can trace authenticators along the chain and check that the root is fresh. The data owner generates the authenticator according to Equation (1). Here i denotes the ith update interval and j denotes the jth authenticator in that interval. νi,j is the encryption of the timestamp tpi,j and the root rti,j of the B+-Tree, and πi,j consists of νi,j together with the data owner's signature on νi,j. πbf is computed according to Equation (2); it is an authenticator that maintains a timestamp chain and a signature, where CBF is a Bloom Filter that covers all tokens and supports efficient updating. The pseudocode of the Init algorithm is shown in Fig. 2.
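As a minimal illustration of Equations (1) and (2), the following Java sketch chains each new authenticator to its predecessor before encrypting and signing. The enc() and sign() methods are placeholders for the 128-bit AES encryption and RSA signature used in our prototype (Section 7.2), and concat() is an illustrative helper.

// Minimal sketch of the timestamp chain of Equation (1):
// nu_{i,j} = Enc_{K3}(root || timestamp || nu_{i,j-1}),
// pi_{i,j} = (nu_{i,j}, Sig(nu_{i,j})).
public class AuthenticatorChain {
    private byte[] prevNu; // nu_{i,j-1}; null before the first authenticator

    // Returns pi = {nu, Sig(nu)} for the current root and timestamp.
    public byte[][] next(byte[] root, long timestamp) {
        byte[] ts = java.nio.ByteBuffer.allocate(8).putLong(timestamp).array();
        byte[] payload = (prevNu == null)
                ? concat(root, ts)          // nu_{i,0} = Enc(rt || tp)
                : concat(root, ts, prevNu); // nu_{i,j} = Enc(rt || tp || nu_{i,j-1})
        byte[] nu = enc(payload);
        prevNu = nu; // link the chain for the next authenticator
        return new byte[][]{nu, sign(nu)};
    }

    private static byte[] concat(byte[]... parts) {
        int len = 0;
        for (byte[] p : parts) len += p.length;
        byte[] out = new byte[len];
        int off = 0;
        for (byte[] p : parts) {
            System.arraycopy(p, 0, out, off, p.length);
            off += p.length;
        }
        return out;
    }

    private static byte[] enc(byte[] m)  { return m.clone(); } // placeholder for Enc_{K3}
    private static byte[] sign(byte[] m) { return m.clone(); } // placeholder for Sig(.)
}

The chain of Equation (2) for πbf is built the same way, with the serialized CBF taking the place of the root. Because each νi,j embeds its predecessor, a server that replays an old authenticator breaks the chain at the first missing link.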

Fig. 2. Algorithm Init.

An example illustrating the Bloom Filter is shown in Fig. 3. Suppose the data owner has three tokens; they are hashed into the Bloom Filter through four hash functions. Suppose the user submits a token to query. The server checks whether a matching τωi exists in the Bloom Filter: it simply hashes the queried τωi with the same four hash functions and checks whether all the corresponding bits are 1. If they are all 1, the queried τωi exists; otherwise, it does not. Fig. 4 shows a case where the queried trapdoor does not exist. Because an ordinary Bloom Filter does not support dynamic updating, we introduce the Counting Bloom Filter, which uses a one-dimensional array to count the number of times each bit is set. As shown in Fig. 5, suppose there are two keywords whose tokens are hashed into the Bloom Filter by 3 hash functions; the array records the number of occurrences of each bit. For an add operation, the corresponding counters are increased by 1, and for a delete operation they are decreased by 1.
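The following Java sketch captures the counting mechanism described above: membership is tested over k hash positions, and a one-dimensional counter array makes deletions possible. The hash family here (SHA-256 salted with the function index) is an illustrative assumption, not necessarily the hash functions of our prototype.

import java.security.MessageDigest;

// Minimal Counting Bloom Filter sketch: k hash positions per token,
// counters for add/delete, and a contains() test that succeeds only
// when all k counters are non-zero.
public class CountingBloomFilter {
    private final int[] counters;
    private final int k; // number of hash functions

    public CountingBloomFilter(int size, int k) {
        this.counters = new int[size];
        this.k = k;
    }

    // Illustrative hash family: SHA-256 over (index || token), reduced mod size.
    private int position(int i, byte[] token) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        md.update((byte) i);
        byte[] d = md.digest(token);
        int h = ((d[0] & 0xff) << 24) | ((d[1] & 0xff) << 16)
              | ((d[2] & 0xff) << 8) | (d[3] & 0xff);
        return Math.floorMod(h, counters.length);
    }

    public void add(byte[] token) throws Exception {
        for (int i = 0; i < k; i++) counters[position(i, token)]++;
    }

    public void delete(byte[] token) throws Exception {
        for (int i = 0; i < k; i++) counters[position(i, token)]--;
    }

    // A positive answer may be a false positive (with negligible
    // probability), but a negative answer is always correct.
    public boolean contains(byte[] token) throws Exception {
        for (int i = 0; i < k; i++)
            if (counters[position(i, token)] == 0) return false;
        return true;
    }
}

A plain Bloom Filter is recovered by reading each counter as a single bit (non-zero means 1), which is why the CBF remains compatible with ordinary Bloom Filter membership tests.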

Fig. 3. The token queried by the user exists in the Bloom Filter.

Fig. 4. The token queried by the user does not exist in the Bloom Filter.

Fig. 5. Description of the Counting Bloom Filter.

5.2 Query Procedure

When an authorized user needs to query the data on the cloud server, the user first generates a token for the keyword and sends it to the cloud server. The cloud server then performs the search algorithm, shown in Fig. 6.

Fig. 6. Algorithm Search.

When the cloud server receives the token sent by the user, we consider both conditions: the token exists in the index or it is absent. First, the server checks whether the token exists in the Counting Bloom Filter by hashing the queried token with the k hash functions and testing whether all the corresponding bits are 1. There are two cases. (1) If not all the tested bits are 1, the token is proved to be absent. In this case, the server only needs to return πbf to the user, without traversing the B+-Tree, and the search complexity is O(1), which is very efficient. (2) If all the tested bits are 1, the token is deemed to exist (with negligible false-positive probability). The server then searches the B+-Tree index and returns the token of each node on the search path σ from the bottom to the root, excluding the leaf node itself; note that, for an internal node, we also need to return the tokens that are not on the search path σ. Then, the server finds the authenticator πtq and the checkpoint authenticator πc closest to the query time and sends ρ, πtq, πc, and Cωi to the user.
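A minimal sketch of this two-case server logic is given below, reusing the CountingBloomFilter class sketched in Section 5.1; the Index interface and Result holder are placeholders for the prototype's B+-Tree index and response format, not its actual API.

import java.util.List;

// Minimal sketch of the Search algorithm: an O(1) CBF check first,
// and a B+-Tree traversal with proof construction only on a hit.
public class SearchServer {
    interface Index { // placeholder for the secure B+-Tree index I
        List<byte[]> searchPath(byte[] token);   // path list rho
        byte[] authenticator(long queryTime);    // pi_tq closest to tq
        byte[] checkpoint(long queryTime);       // checkpoint authenticator pi_c
        List<byte[]> documents(byte[] token);    // encrypted results
    }

    static class Result { // placeholder response holder
        boolean present;
        byte[] proof;           // pi_bf when the token is absent
        List<byte[]> rho, docs; // path list and ciphertexts when present
        byte[] piTq, piC;
    }

    private final CountingBloomFilter cbf; // front engine over all tokens
    private final Index index;
    private final byte[] piBf;             // CBF authenticator

    SearchServer(CountingBloomFilter cbf, Index index, byte[] piBf) {
        this.cbf = cbf; this.index = index; this.piBf = piBf;
    }

    Result search(byte[] token, long queryTime) throws Exception {
        Result r = new Result();
        if (!cbf.contains(token)) {
            // Case 1: token absent. Return only pi_bf; no traversal, O(1).
            r.proof = piBf;
            return r;
        }
        // Case 2: token (very likely) present. Traverse the B+-Tree and
        // assemble the verification material closest to the query time.
        r.present = true;
        r.rho = index.searchPath(token);
        r.piTq = index.authenticator(queryTime);
        r.piC = index.checkpoint(queryTime);
        r.docs = index.documents(token);
        return r;
    }
}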

5.3 Verifying the Search Results

For the verification of data integrity, we hash each node of the B+-Tree layer by layer, from the leaf nodes up to the root. The user can then verify the integrity of the query results against the root node. Alternatively, the data owner hashes only the leaf nodes. We use a timestamp mechanism, shown in Fig. 7, to verify the freshness of the data. It consists of the update interval and the authenticator π. The update interval is set by the data owner based on the update frequency, and the authenticator is composed of the root of the B+-Tree and the timestamp.
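A minimal sketch of this layer-by-layer hashing is shown below, assuming an illustrative Node interface over the B+-Tree; this is Merkle-style hashing, where a leaf is hashed over its serialized key-value pairs and an internal node over its children's digests.

import java.security.MessageDigest;
import java.util.List;

// Minimal sketch of computing the B+-Tree root digest bottom-up. The
// user later rebuilds this digest from the returned search path and
// compares it with the root recovered from the authenticator.
public class RootHasher {
    interface Node { // illustrative stand-in for the tree's node type
        boolean isLeaf();
        byte[] content();      // serialized key-value pairs of a leaf
        List<Node> children(); // children of an internal node
    }

    public static byte[] digest(Node node) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        if (node.isLeaf()) {
            return md.digest(node.content());
        }
        // Internal node: hash over the concatenated child digests.
        for (Node child : node.children()) {
            md.update(digest(child));
        }
        return md.digest();
    }
}

With this construction, any modification to a leaf or to the tree structure changes the root digest, so a tampered or truncated result list cannot reproduce the root stored in the authenticator.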

Fig. 7. Description of the timestamp-chain mechanism.

To prevent the server from returning partial or historical versions of the query results, the authorized user performs the verification algorithm (shown in Fig. 8) upon receiving the query result from the cloud server. There are two cases to verify: either the keyword does not exist, or it exists.

Fig. 8. Algorithm Validation.

When the queried keyword does not exist, the user can use the πbf authenticator to check whether the keyword is truly non-existent, and thereby detect whether the server is malicious. The CBF authenticator consists of the CBF with a timestamp and the corresponding signature of the data owner. The user checks the non-existence of tokens through the CBF, and the freshness of the CBF is checked via the timestamp mechanism. If the verification algorithm returns 1, the user accepts the result from the server; otherwise, the user rejects it. The complexity of this verification is O(1), so the cost to the user is negligible.
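The user-side check can be sketched as follows; verifySig(), decryptCbf(), decryptTimestamp(), and isFresh() are placeholders for the RSA verification, AES decryption, and timestamp-chain check of the scheme, and CountingBloomFilter is the class sketched in Section 5.1.

// Minimal sketch of the O(1) non-existence check: verify the owner's
// signature on psi_bf, recover the CBF and timestamp, check freshness,
// then accept only if the CBF itself reports the token as absent.
public class AbsenceVerifier {
    public boolean verifyAbsent(byte[] psiBf, byte[] sig, byte[] token) throws Exception {
        if (!verifySig(psiBf, sig)) return false;            // not signed by the data owner
        if (!isFresh(decryptTimestamp(psiBf))) return false; // stale CBF: possible replay
        CountingBloomFilter cbf = decryptCbf(psiBf);
        return !cbf.contains(token); // one zero position proves absence
    }

    // Placeholders for the underlying cryptographic operations.
    private boolean verifySig(byte[] m, byte[] sig) { return true; }
    private long decryptTimestamp(byte[] psiBf) { return 0L; }
    private CountingBloomFilter decryptCbf(byte[] psiBf) { return new CountingBloomFilter(1024, 4); }
    private boolean isFresh(long ts) { return true; }
}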

When the searched keyword exists, the user rebuilds the root of the B+-Tree from the list returned by the server and verifies the integrity of the search results by comparing the rebuilt root with the root in the authenticator. The freshness of the results is verified based on the timestamp in the decrypted authenticator.

Let us explain a replay attack through the following example. When the user queries at t1 (see Fig. 7), where t1 < tpi,0, the server can only return πi−1,n to the user. When a user queries at t2, after the data update event at tpi,0, the server has two options: it can return the authenticator πi,0 to the user, or it can send the outdated authenticator πi−1,n. When the server returns πi−1,n, it launches a freshness (replay) attack.

5.4 Update

The Update algorithm updates the secure index on the server and supports three operations: add, delete, and edit. The data owner first executes the PreUpdate algorithm to create the tokens for the update, the CBF, and the signature, and sends them to the cloud server. Then, the cloud server executes the Update algorithm to update the CBF and the secure index I. Since both the secure index and the files are encrypted, the server cannot learn anything useful from them. For an add operation, the data owner generates {τω, GK2(f)} pairs, where τω is the token of a specific keyword extracted from the file f, and GK2(f) is a pseudo-random string derived from f. The cloud server finds the corresponding leaf node based on the token and appends the value IH(GK2(f)) to the existing node value; if the token has no corresponding leaf node, a new leaf node is created. The corresponding counters in the Counting Bloom Filter also need to be increased by 1 (see Fig. 5). The authenticators πbf and π also need to be updated by the data owner: π and πbf are updated according to Equations (1) and (2), respectively, and the data owner sends the updated πbf and π to the cloud server. The delete operation is similar to the add operation, and the edit operation is equivalent to adding a new file after deleting the old one.

5.5 An Illustrative Example

We illustrate our scheme through an example. Suppose there are nine files to initialize, i.e., {f1, f2, f3, f4, f5, f6, f7, f8, f9}, which contain the nine keywords presented in the inverted list (see Fig. 9). The query tokens and values corresponding to the keywords are also given in the inverted list (Fig. 9). Initially, we build the secure index by inserting the key-value pairs into the B+-Tree (see Fig. 10). The data owner hashes all tokens into the Counting Bloom Filter to generate πbf, and an authenticator π is generated according to Equation (1). For an update operation that adds a new file f10 containing ω5 and ω10, the update tokens are ['4af8', IH(GK2(f10))] and ['ff56', IH(GK2(f10))]. For the token '4af8', which already exists, the server appends IH(GK2(f10)) to the original node. For the new token 'ff56', the server creates a new node and assigns the new value IH(GK2(f10)) to it. The data owner then updates the authenticators πbf and π accordingly.

Fig. 9. The inverted list.

Fig. 10. The index of the B+-Tree.

Now suppose that the user wants to query the keyword ω3 and submits the corresponding token '30ba'. After the server receives the user's token, it first tests whether the token exists in the Bloom Filter. If it exists, the server performs the search for the user. The search path of this token is {IN1, LN2}, and the search list is [Mr2, Mr1, Mr0] (see Fig. 11). When the user receives the result list ρ from the server, the user runs the Validation algorithm and generates the root hash from the list ρ. If there is an attack, for example the server returns only the file f3, the rebuilt root hash will not match the root in the authenticator, so the algorithm returns 0. If they are equal, it proves that the server returned the correct result. Now suppose the user submits a token that does not exist in the proof index, e.g., the token '30bb'. The server only needs to send the user the proof authenticator πbf. The user verifies the signature and then hashes the queried token into a Bloom Filter CBF′. The correctness of the result is verified by comparing CBF′ with the CBF decrypted from the authenticator πbf.

Fig. 11. Search list ρ of an existing token '30ba'.

SECTION 6

Security Analysis

In this section, we conduct a security analysis of our scheme. On the one hand, we analyze the security and confidentiality of the data on the cloud servers; on the other hand, we make sure that users authorized by the data owner can verify the search results correctly. Confidentiality means that an attacker cannot learn any useful information from the encrypted documents directly. Since the secure index is built from tokens, the cloud server cannot learn any useful information about the keywords. The data owner encrypts all the documents and stores them on the cloud server, so the server cannot learn any useful information from the encrypted documents either.

Verifiability is the ability of the users to verify the integrity and freshness of the search results; users can detect whether the results have been tampered with. The general security proof is given via Real/Ideal simulation. Definition (confidentiality): Let ESVSSE = (Gen, Init, PreUpdate, Update, Trapdoor, Search, Validation) be an SSE scheme based on our CBF and B+-Tree index, and let k be a security parameter. Suppose A is a stateful adversary, S is a stateful simulator, and L is a stateful leakage algorithm. Consider the following probabilistic experiments RealSSE,A(k) and IdealSSE,A,S(k).

RealSSE,A(k): A challenger runs Gen to generate the private keys K1, K2, K3, ssk. An adversary A selects the document set D, and the challenger generates a secure index I and a validator π for A through the Init(K1, K2, K3, ssk, D) → {I, π} algorithm. A performs a polynomial number q of queries. For each query, A receives the token from the challenger, obtained via the Trapdoor(K1, ω) → {τω} algorithm; the search result is obtained via Search(I, τω, tq) → {ρ, πtq, πc}; and the update tokens are produced via the PreUpdate(K1, K2, K3, ssk, f) → {τω, π} and Update(I, τω) → {I} algorithms. Finally, A returns a bit b: it accepts if b equals 1 and rejects otherwise.

IdealSSE,A,S(k): An adversary A selects the document set D. Given the leakage function L, the simulator S generates a secure index and a validator through the Gen(1^k) → {K1, K2, K3, (ssk, spk)} algorithm and sends them to the adversary A. A performs a polynomial number q of queries. For each query, S returns the token τω and the validator π in response to A. Finally, A returns b: it accepts if b equals 1 and rejects otherwise. If for every probabilistic polynomial-time adversary A there is a probabilistic polynomial-time simulator S such that the two experiments are indistinguishable, we say that the SSE scheme is L-confidential. Theorem 1. If F and G are pseudo-random functions, the SSE scheme is L-confidential.

Proof.

We prove that for every probabilistic polynomial-time adversary A there is a polynomial-time simulator S whose output in Ideal is indistinguishable from Real. First, S simulates the secure index I′ by randomly selecting |I| key-value pairs and inserting them into a B+-Tree. At the same time, S selects a random string π′ of length |π| as the validator, and S hashes the tokens of all selected keywords into the Bloom Filter. Each key-value pair in the B+-Tree is encrypted by a pseudo-random function, and the confidentiality of the validator is guaranteed by the key-value-pair encryption. Therefore, A cannot distinguish (I′, π′) from (I, π). When A initiates a search, S simulates a search token τ′ω. The token queried by A is first hashed into the CBF to check whether it exists; if the queried token exists in I′, S selects a random result path and returns it to A. A cannot distinguish a real token τω from a simulated token τ′ω. Tokens queried later, if they have appeared before, are answered the same as before, consistent with the first simulated token. Similarly, when A requests an update for the token τω, the updated token is set to τ′ω, and the validator π′ is set to a random string of the same length as π. For each query, S randomly selects a string to simulate a search token. In the Real game, all tokens are encrypted by the pseudo-random function F, so the adversary A cannot distinguish whether a token comes from RealSSE,A(k) or IdealSSE,A,S(k).

The verifiability of a symmetric searchable encryption scheme means that the user can verify the integrity and freshness of the search results returned by the server. Integrity verification prevents the server from returning partial or incorrect search results. Freshness verification is achieved through the timestamp mechanism, which prevents the server from launching a replay attack, i.e., returning historical search results to the user. We therefore use a game-based security definition to prove the verifiability of symmetric searchable encryption in ESVSSE.

Definition (verifiability): Assume that ESVSSE is a secure, dynamic, verifiable symmetric searchable encryption scheme, and that A is a stateful adversary. Consider the following probabilistic experiment. Verify(k): The challenger runs the Gen(k) algorithm to generate the private keys. The adversary A selects the document set D, and the challenger creates a secure index I and a validator through the Init algorithm. The challenger initiates a query, and the adversary A returns the search results C and the validator π. When the challenger receives the results from A, it verifies them by computing b = Validation(K1, K2, K3, spk, Rω, ρ, πtq, πc, τω) ∈ {0, 1}. The output of the experiment is b. We say that the ESVSSE scheme is verifiable if, for every probabilistic polynomial-time adversary A,

$$\Pr[\mathrm{Verify}(k)=1] \le \mathrm{negl}(k).$$

Proof. Consider the situation in which the search results returned by the server differ from the correct search results but the verification algorithm accepts them. To show that ESVSSE is verifiable, we only need to analyze the verification algorithm. First, the validator is generated and updated by the data owner and is protected by the underlying encryption and digital signature; without the signing key and the private key K3, no one can modify or forge the validator. Second, hash collisions could occur, either when generating the secure index or on the hash path the user recomputes during verification. But since the hash function we use is collision-resistant, the likelihood of either situation is negligible. Therefore, our verification algorithm is sound, and the ESVSSE scheme is verifiable. Through the simulation experiments and security analysis above, the ESVSSE solution is secure and feasible.

Besides, the query pattern and search pattern are leaked, which is common in many current schemes; we can leverage Oblivious RAM to alleviate such leakage.

SECTION 7

Performance Analysis and Evaluation

In this section, we give a detailed performance analysis and evaluation of our scheme.

7.1 Performance Analysis

Checking existence is independent of the search algorithm, so the average performance of our scheme depends on the missing rate of the keywords. We define mr as the missing rate. Suppose mr equals a%; this means that if the user performs n queries, there are n × a% queried tokens that do not exist. The complexity of checking the CBF is O(1), independent of the scale of the dataset, while the complexity of searching the B+-Tree is about O(log(|W|)), where |W| is the size of the keyword set. The average cost of our search algorithm is therefore about O(1) × a% + O(log(|W|)) × (100 − a)%, and the cost of verification behaves similarly.
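As a concrete worked instance of this formula, consider the 20 percent missing rate quoted in the abstract. Writing t_cbf for the cost of the O(1) CBF check and t_tree for a B+-Tree search, the average cost per query is

$$\mathrm{Cost}_{avg} = t_{cbf} \times 20\% + t_{tree} \times 80\% \approx 0.8\, t_{tree} \qquad (t_{cbf} \ll t_{tree}),$$

i.e., roughly a 20 percent saving over the O(log(|W|))-per-query baseline, matching the improvement reported in the abstract; at a 50 percent missing rate the saving grows to roughly 50 percent.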

In contrast, the cost of most current schemes with the same functionality is about O(log |W|) per query. The overall performance improvement of our scheme thus depends on the missing rate of the queried keywords. The proposed CBF architecture is somewhat like a translation lookaside buffer in a memory system, and the CBF supports efficient dynamic updating. The performance on the users' side is improved similarly, which is especially critical for energy- or computation-constrained devices.

In the real world, many cases increase the missing rate, such as spelling errors or hitting an adjacent key when typing. The rate also depends on how the index is built. Because these aspects vary across people and datasets, we give a general analysis. People mainly operate computers through a keyboard interface, and they often make spelling errors when typing, for example, hitting an adjacent key or using phonologically similar letters [40]. [40] defines spelling errors as (1) typos from incorrect keyboard operation and (2) spelling confusions from incorrect identification. Their experiments show that, among the corrected errors, substitution errors were the most frequent, deletion and insertion errors occurred at almost the same rate, and transposition errors were few. [41] studied the types of English spelling errors made by students whose native language is English. The students' spelling errors were analyzed by grade-level spelling patterns and linguistic characteristics: phonological, orthographic, orthographic image, transposition, and morphological. Their results revealed that morphological spelling-pattern errors occurred most frequently. Of the opportunities for students to make linguistic spelling errors, phonological errors occurred in 1.30 percent, orthographic errors in 8.52 percent, orthographic image errors in 8.23 percent, and morphological errors in 56.2 percent. The dataset also affects the built index, as does the way people build the index. For example, in the typical Enron mailbox dataset [42], if all the IDs are put into the index, it is hard for people to distinguish them.

7.2 Experiment Setup

To demonstrate the feasibility of our proposed scheme, we implemented it in Java with JDK 8.0; the prototype comprises about 4,000 lines of code. The operating system is 64-bit Windows 7, on a computer with an i5-6500 CPU and 16 GB RAM. We use Python to extract keywords. We use 128-bit AES to encrypt the authenticator and sign it with an RSA signature, and we implement two random oracles with HMAC-SHA256 and SHA3-256. We use the Enron mailbox dataset [42]. The following subsections present our experimental and testing results.

7.3 Index Construction Cost

First, we measure the delays of the initialization algorithm performed by the data owner, as shown in Fig. 12. The Init algorithm includes the construction of the secure index and the CBF and the generation of the authenticator. All delays and subsequent measurements are averages over ten runs. We tested the delays of establishing B+-Tree secure indexes of different orders over 500,000 key-value pairs, mainly measuring the relationship between the number of keyword-document pairs and the total time required to generate the secure index and the authenticator. We test B+-Trees of order 100, 500, and 900. As the number of key-value pairs increases, the time consumed also increases. When the number of key-value pairs is below 100,000, the time overheads are similar. Between 100,000 and 200,000 pairs, the time overhead of the order-900 tree is slightly larger than the other two, and the total initialization times of the order-100 and order-500 B+-Trees are almost the same. Above 200,000 pairs, the order-100 B+-Tree has the minimum initialization time and the order-900 tree the largest. Overall, initialization takes around 1.5 seconds when the documents include 500,000 keywords, which is acceptable.

Fig. 12. Init delays.

We measure the storage cost of the B+-Tree as shown in Fig. 13. For a dataset containing 500,000 keywords, the storage overhead is about 100 MB, and as the number of keywords increases, the storage overhead grows linearly. We also tested the storage space of B+-Trees of different orders: as shown in Fig. 13, the storage space of the B+-Tree has little relationship with the order of the tree.

Fig. 13. Storage cost of the B+-Tree.

7.4 Performance of Update

The update delay is decided by the size of the database, measured by the number of keywords; strictly speaking, the delay is also directly related to the order of the B+-Tree. To show the relationship between update latency and database size, we measure the latency with different numbers of keywords. Since the number of keywords varies from file to file, we report the throughput, i.e., the number of keyword-document pairs that can be updated per second (see Fig. 14). When the number of key-value pairs is between 100,000 and 200,000, the throughput of delete operations decreases as the number of keywords increases; between 200,000 and 500,000, it is almost stable. In this interval the throughput is also related to the order: as the order increases, the deletion throughput decreases, with a gap of 10,000 to 20,000 pairs per second for a fixed dataset. We observed that the throughputs of add and delete operations are almost the same, and both are relatively stable.

Fig. 14. Update throughput.

7.5 Performance of Query

When the user launches a query operation, the token of the corresponding keyword is generated and sent to the server. Fig. 15 shows the relationship between the number of keyword-document pairs and the search time on the server. Note that this experiment only measures the cost of generating the proof; the waiting time for the authenticator at the checkpoint is not included. Fig. 16 shows the query time on the CBF when the keyword queried by the user does not exist. As the figure shows, the server's search time is about 0.03 milliseconds on a database containing 500,000 tokens. The time cost is stable and negligible, because the server just needs to return the corresponding CBF authenticator when there is no match in the CBF. The complexity is about O(1), and the results are consistent with our performance analysis.

Fig. 15. Search delay for existing tokens.

Fig. 16. Search delay for absent tokens.

7.6 Performance of Verification

In Fig. 17, we evaluate the verification delays for users authorized by the data owner. The verification time is proportional to the number of keyword-document pairs: the larger the number of keywords, the longer the server takes to search. For a fixed number of keywords, the server's search time is also related to the order of the B+-Tree; as shown in Fig. 17, the smaller the order, the longer the search. When the order of the B+-Tree is 500 and 900, the verification delays are similar, although the search delays differ. A database of 500,000 key-value pairs has a verification delay of approximately 4 milliseconds. As shown in Fig. 17, the user's verification time is proportional to the size of the dataset, and for a fixed dataset the verification cost is stable and very short, whereas the validation cost of the other two schemes increases linearly.

Fig. 17. Verification latency for existing tokens.

Fig. 18 illustrates the verification time when a keyword queried by a user does not exist and compares the result with previous work. Here, the verification time includes the time for the user to decrypt the verifier and the time to perform the comparison; since hashing the queried keywords into the CBF is very fast, its time can be ignored. We tested the trend of the CBF authenticator's verification time as the number of keywords changes and compared it with two schemes: MVSSE in Fig. 18 is the scheme proposed in [36], and VSSE is the scheme proposed in [15]. As the figure shows, the verification time of ESVSSE is almost constant as the number of keywords increases, whereas for the other two schemes the verification time grows with the number of keywords. In general, our solution is more efficient than existing solutions under the same attack model.

Fig. 18. Authenticator πbf delays when the token does not exist.

Fig. 19 shows the bandwidth consumption of the authenticator. Here, η is the update frequency of the data owner. The size of the first authenticator in each update interval is around 176 bytes, comprising 32 bytes for the root of the B+-Tree, 8 bytes for the timestamp, an 8-byte AES-CBC extension, and a 128-byte RSA signature. The bandwidth consists of two parts: the overhead introduced by the fixed update time points and the overhead from data updates. We can see that when the update interval is reduced toward 0, the bandwidth overhead increases to about 2 KB per second; this part is introduced by the fixed update time points and is inversely proportional to the update interval. As the update interval grows, the length of the authenticator becomes larger.

Fig. 19. Bandwidth consumption.

7.7 Storage Cost of Verification

To better assess the communication overhead of the scheme during the verification phase, we test the storage cost of the validator in different situations over the keyword-document pairs. Our solution distinguishes two cases: the keyword searched by the user exists (see Fig. 20) or it does not (see Fig. 21). We also test the memory occupied by the validator for different B+-Tree orders; the experiments show that the size of the validator is related to the order. When the search token exists, the minimum validator size for a dataset of 500,000 keyword-document pairs is 400 KB, and when the order is 100 the validator occupies 1100 KB. Choosing an appropriate order is therefore also a way to reduce storage overhead. When the keyword does not exist, the server returns the CBF verifier, which covers all the leaf-node tokens of the B+-Tree and is therefore independent of the tree order. For 500,000 keyword-document pairs, the storage overhead of this validator is 1220 KB. In real life, data comes in various formats, not just documents, so an index validator at the KB level is still acceptable.

Fig. 20. The storage cost of the authenticator.

Fig. 21. The storage cost of the CBF authenticator.

7.8 Comparison With Existing Schemes

We compare our scheme with the concrete implementations of the dynamic SSE scheme proposed by Cash et al. [20] and the scheme of Jie Zhu et al. [15]. The experimental results show that the additional overhead introduced by the ESVSSE scheme is not large. To compare the schemes fairly, we test with the same dataset and parameters on our machine. As shown in Fig. 22, we measure the overall overhead of the initialization, search, and update phases of those schemes. The average performance is calculated as ta × a% + te × (100 − a)%, where ta is the time cost when the keyword is absent, te is the time cost when the keyword exists, and a% is the missing rate mr. As shown in Fig. 22, the average performance improves as the missing rate mr grows. The results are consistent with our performance analysis.

Fig. 22. Comparison with other SSE schemes.

We further verify the feasibility of our solution by comparing it with a well-known dynamic SSE solution and a universal verifiable solution. To compare these schemes fairly, we use the same dataset and parameters on our machines. We tested on the well-known mailbox dataset, which contains 500,000 documents, and extracted 2 million keyword-document pairs from it. The test covers the initialization, search, and update stages: initialization indexes the 2 million keyword-document pairs, and the search and verification phases are tested over 10,000 queries, with results shown in Table 3. Because search results differ across queries, we report the average communication cost over 5,000 runs. As Table 3 shows, the average search token of the SSE scheme [20] is 390 B with a 53 KB search result; the token of the GSSE scheme is 32 B with a 3 KB search-and-verification overhead; and in our scheme the storage overhead of search verification is 2.7 KB. This shows that the overhead generated by ESVSSE is acceptable.

TABLE 3. Comparison With the SSE Scheme [20] and [15]
SECTION 8

Future Directions and Conclusion

In future work, we can consider more complex searches, such as multiple-keyword extraction and conjunctive query search. In addition, the Bloom filter has the potential for false positives, but the probability is negligible; compared with the computational overhead saved, this error probability is acceptable.

As graph data continue to increase such as social networks and biological networks. Current SSE cannot fully satisfy queries of graph data. In the future, we can study more complex search directions, such as matrix queries, graph adjacency queries, and so on. With the rise of Network Function Virtualization (NFV) technology, some studies have applied searchable encryption technology to the middlebox. Due to the excellent characteristics of the blockchain, the research on combining the blockchain with searchable encryption is also a direction.

In summary, this paper presents an efficient SSE scheme based on B+-Tree and CBF that supports secure verification, dynamic updating, and multi-user queries. Thanks to the CBF, both the server and the user need only O(1) time to verify whether a queried token exists, and the CBF also supports efficient updating. When the token exists, we compose the authenticator by encrypting and signing the root and timestamp of the B+-Tree, so users can check the integrity of the results returned by the server. Finally, we implement our scheme and conduct comprehensive experiments to evaluate it. The results are reasonable and consistent with our performance analysis, and owing to the CBF the time cost is stable and very low when the queried keyword does not exist.
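The following minimal Python sketch illustrates these two ingredients. It is our illustration under assumed details: the hash derivation, counter layout, and the HMAC stand-in for the authenticator are not the paper's exact construction, which encrypts as well as signs the root and timestamp.

import hashlib, hmac, time

class CountingBloomFilter:
    """Bloom filter backed by a counter array so deletions are supported."""
    def __init__(self, m, k):
        self.counters = [0] * m
        self.m, self.k = m, k

    def _positions(self, token):
        # Derive k positions from k salted hashes of the token.
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + token).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, token):            # O(k) = O(1) for fixed k
        for p in self._positions(token):
            self.counters[p] += 1

    def delete(self, token):            # dynamic updating via counters
        for p in self._positions(token):
            if self.counters[p] > 0:
                self.counters[p] -= 1

    def contains(self, token):          # O(1) existence check
        return all(self.counters[p] > 0 for p in self._positions(token))

def make_authenticator(key, root_digest, timestamp):
    # MAC over the tree-root digest and a timestamp; the timestamp
    # is what foils a replay of stale results by a malicious cloud.
    return hmac.new(key, root_digest + timestamp.to_bytes(8, "big"),
                    hashlib.sha256).digest()

cbf = CountingBloomFilter(m=1 << 20, k=7)
cbf.insert(b"keyword-token")
assert cbf.contains(b"keyword-token")
cbf.delete(b"keyword-token")
tag = make_authenticator(b"shared-key", b"\x00" * 32, int(time.time()))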

ACKNOWLEDGMENTS

The authors thank the anonymous reviewers and the editor for their valuable comments and suggestions. This work was partially supported by the National Natural Science Foundation of China (No. 61672176), the Guangxi "Bagui Scholar" Teams for Innovation and Research Project, the Guangxi Talent Highland Project of Big Data Intelligence and Application, the Guangxi Science and Technology Plan Project No. AD20159039, the Guangxi Young and Middle-aged Ability Improvement Project No. 2020KY02032, and the Innovation Project of Guangxi Graduate Education No. YCSW2020110.

References

1. B. Waters, J. Bethencourt and A. Sahai, "Ciphertext-policy attribute-based encryption", Proc. IEEE Symp. Secur. Privacy, vol. 10, pp. 321-334, 2007.
2. "What is cloud computing", ACM, vol. 51, pp. 9-11, 2011.
3. A. K. Iyengar, "Enhanced clients for data stores and cloud services", IEEE Trans. Knowl. Data Eng., vol. 31, no. 10, pp. 1969-1983, Oct. 2019.
4. X. Ge, Y. Wang, J. Fu, J. Wu and L. Ping, "Cloud storage as the infrastructure of cloud computing", Proc. IEEE Int. Conf. Intell. Comput. Cogn. Informat., pp. 380-383, 2010.
5. L. Weng, L. Amsaleg and T. Furon, "Privacy-preserving outsourced media search", IEEE Trans. Knowl. Data Eng., vol. 28, no. 10, pp. 2738-2751, Oct. 2016.
6. S. Kamara, C. Papamanthou and T. Roeder, "CS2: A searchable cryptographic cloud storage system", Microsoft Res., Tech. Rep. MSR-TR-2011-58, 2011.
7. K. Ren, B. Zhang, R. Xie, K. Yang and X. Jia, "Effective data access control for multi-authority cloud storage systems", IEEE Trans. Inf. Forensics Secur., vol. 8, no. 11, pp. 1790-1799, Nov. 2013.
8. L. M. Gupta, K. A. Nielsen, M. G. Borlick and L. M. Gupta, "Method system and computer program product for distributed storage of data in a heterogeneous cloud", Int. Bus. Machines Corporation, vol. 10, 2019.
9. H. Ancin, X. Chen, A. Jassal, D. H. Jung, G. B. Neustaetter and S. H. Puttergill, "Systems and method for facilitating access to private files using a cloud storage system", 2016.
10. K. Lauter and S. Kamara, "Cryptographic cloud storage", Proc. Int. Conf. Financial Cryptogr. Data Secur., pp. 136-149, 2010.
11. K. Kurosawa and Y. Ohtaki, "UC-secure searchable symmetric encryption", Proc. Int. Conf. Financial Cryptogr. Data Secur., pp. 285-298, 2012.
12. K. Kurosawa and Y. Ohtaki, "How to update documents verifiably in searchable symmetric encryption", Proc. Int. Conf. Cryptology Netw. Secur., pp. 309-328, 2013.
13. C. Papamanthou, E. Stefanov and E. Shi, "Practical dynamic searchable encryption with small leakage", Proc. Netw. Distrib. Syst. Secur. Symp., pp. 23-26, 2014.
14. P.-A. Fouque, R. Bost and D. Pointcheval, "Verifiable dynamic symmetric searchable encryption: Optimality and forward security", IACR Cryptol. ePrint Archive, vol. 2016, pp. 1-40, 2016.
15. C. Wang, X. Yuan, Q. Wang, J. Zhu, Q. Li and K. Ren, "Enabling generic verifiable and secure data search in cloud services", IEEE Trans. Parallel Distrib. Syst., vol. 29, no. 8, pp. 1721-1735, Aug. 2018.
16. G. Gong and Q. Chai, "Verifiable symmetric searchable encryption for semi-honest-but-curious cloud servers", Proc. IEEE Int. Conf. Commun., pp. 917-922, 2012.
17. S. Kamara and C. Papamanthou, "Parallel and dynamic searchable symmetric encryption", Financial Cryptogr. Data Secur., vol. 23, pp. 258-274, 2013.
18. W. Lou, Y. T. Hou, W. Sun, X. Liu and H. Li, "Catch you if you lie to me: Efficient verifiable conjunctive keyword search over large dynamic encrypted cloud data", Proc. IEEE Conf. Comput. Commun., pp. 2110-2118, 2015.
19. C. Jutla, H. Krawczyk, M.-C. Roşu, D. Cash, S. Jarecki and M. Steiner, "Highly-scalable searchable symmetric encryption with support for boolean queries", Proc. Annu. Cryptol. Conf., pp. 353-373, 2013.
20. S. Kamara, R. Curtmola, J. Garay and R. Ostrovsky, "Searchable symmetric encryption: Improved definitions and efficient constructions", J. Comput. Secur., vol. 19, pp. 895-934, 2011.
21. C. Papamanthou, S. Kamara and T. Roeder, "Dynamic searchable symmetric encryption", Proc. ACM Conf. Comput. Commun. Secur., pp. 965-976, 2012.
22. K. Ren, W. Lou, Q. Wang, C. Wang and J. Li, "Enabling public auditability and data dynamics for storage security in cloud computing", IEEE Trans. Parallel Distrib. Syst., vol. 22, no. 5, pp. 847-859, May 2011.
23. J. Zhao, X. He, S. G. Worku and C. Xu, "Secure and efficient privacy-preserving public auditing scheme for cloud storage", Comput. Electr. Eng., vol. 40, pp. 1703-1713, 2014.
24. A. Juels, E. Stefanov and M. van Dijk, "Iris: A scalable cloud file system with efficient integrity checks", Proc. 28th Annu. Comput. Secur. Appl. Conf., pp. 229-238, 2012.
25. S. Yu and J. Yuan, "Efficient public integrity checking for cloud data sharing with multi-user modification", Proc. IEEE Conf. Comput. Commun., pp. 2121-2129, 2014.
26. W. Ogata and K. Kurosawa, "Efficient no-dictionary verifiable SSE", IACR Cryptol. ePrint Archive, vol. 2016, pp. 1-25, 2016.
27. M. Walterbusch, B. Martens and F. Teuteberg, "Costing of cloud computing services: A total cost of ownership approach", Proc. Int. Conf. Syst. Sci., pp. 1563-1572, 2012.
28. M. Livny, B. Berriman, E. Deelman, G. Singh and J. Good, "The cost of doing science on the cloud: The montage example", Proc. ACM/IEEE Conf. Supercomput., pp. 1-12, 2008.
29. S. Jarecki et al., "Dynamic searchable encryption in very-large databases: Data structures and implementation", Proc. Netw. Distrib. Syst. Secur. Symp., pp. 1-16, 2014.
30. Y. Chen et al., "OceanStore: An architecture for global-scale persistent storage", Proc. ACM Int. Conf. Architectural Support Program. Lang. Operating Syst., pp. 190-201, 2000.