CC BY - IEEE is not the copyright holder of this material. Please follow the instructions at https://creativecommons.org/licenses/by/4.0/ to obtain the full-text article and the stipulations set out in the API documentation.
Introduction
Cloud computing has been widely deployed and has become indispensable for many companies and institutions, due to its salient features such as on-demand services, elasticity, flexibility, multi-tenancy, efficient access to data, and reduced maintenance cost [1]. Various cloud-based services and applications have gradually been adopted, including by many governments, educational institutions, medical institutions, and large enterprise groups [2]. As one of the key services, data outsourcing [3], [4], [5], [6], [7], [8], [9], [10] solves the problem of storing large amounts of data at low cost for users with limited storage resources and has been widely used.
If data is stored in the cloud directly, users risk revealing private information. This is especially critical for high-impact business data or medical records, since most service providers do not guarantee that they will not read or modify clients' data. A cloud server may also behave selfishly to save computation or transmission overhead [14], [16]. It is therefore essential for an elegant and practical outsourced storage scheme to provide security and privacy guarantees such as confidentiality, integrity, freshness, authenticity, and verification, in addition to user experience and conventional operability and management such as updating, retrieval, and multi-user support.
To ensure data confidentiality without losing operability and manageability of the data, many SSE schemes [11], [17], [18], [19], [20], [21] were proposed at an early stage. However, most of these SSE schemes assume that the cloud server is honest but curious [19], [20]. This assumption is not always valid, because the cloud server may suffer from external attacks, internal configuration errors, software vulnerabilities, and insider threats [11], [18]. The data may be tampered with, or the server may return a cached history or partial results.
To prevent a malicious server from returning partial or tampered results, integrity checking should be supported. However, most schemes ignore this or related issues. For example, some existing SSE schemes do not guarantee the integrity of the search results returned by the server to the user [17], [18], [19], [20]. Although some schemes can ensure the integrity of data [6], [22], and some verifiable SSE schemes have been studied extensively [11], [13], [14], [23], [24], [25], [26], these constructions unfortunately only support verification over static databases [11], [26]. The scheme in [21] cannot prevent a malicious server from returning an empty result. The scheme in [26], using cuckoo hashing, solves the problem of the server returning empty results, but it does not support dynamic updates for data verification.
In addition, most of the currently proposed SSE schemes are two-party models between data owners and servers. [15] proposed a verifiable three-party SSE scheme based on a Merkle Tree index. The three-party model includes the data owner, users, and servers. However, one drawback of [15] is that the server needs to traverse the whole index even when the queried keyword is not present. In this situation, the server takes time to traverse the whole index and create the results for verification, then sends the results to the user; the user in turn takes time to verify the results, only to obtain an empty result. The user experience is greatly affected in this scenario. Many cases can lead to this scenario, such as querying a non-existent keyword, spelling errors, hitting an adjacent key when typing, etc. This drawback exists in most current SSE schemes. We will give a detailed analysis in a later section.
Table 1 compares various existing verifiable SSE schemes with our proposed scheme. To the best of our knowledge, none of the existing verifiable SSE schemes can explicitly allow users to verify the absence of keywords in the three-party model efficiently. In summary, the performance and security functionalities are not fully exploited.
Challenges. Although there have been many SSE schemes, some challenges still need to be fully explored.
First, when a user makes a search query, how does the user verify the integrity of the results returned by the server? In our scheme, the data owner uses a B+-Tree to build the index and hashes its nodes to generate a root using several hash functions. The user verifies the integrity of the search results through this root. We leverage two methods to generate the root of the B+-Tree. One is to hash all the nodes to generate the root. The other is to hash only the leaf nodes; since the leaf nodes contain all key-value pairs, the integrity of the search results can still be verified. Although there are many applications based on the B+-Tree, they cannot be utilized directly in this context while providing strong privacy and security guarantees such as integrity checking and verification.
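The leaf-node variant can be sketched as a Merkle-style computation over the B+-Tree leaves. The following is an illustrative sketch only: the SHA-256 choice, the pairwise combination rule, and all names are our assumptions, not the paper's exact construction.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Collision-resistant hash used for node digests."""
    return hashlib.sha256(data).digest()

def leaf_root(leaves):
    """Build a root digest by hashing the leaf-level key-value pairs
    pairwise up to a single value (Merkle-style). Because the B+-Tree
    keeps every key-value pair in its leaves, this root commits to the
    entire index, so it suffices for integrity verification."""
    level = [h(k.encode() + b"|" + v.encode()) for k, v in leaves]
    while len(level) > 1:
        if len(level) % 2:               # duplicate the last digest on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

root = leaf_root([("alice", "doc1,doc3"), ("bob", "doc2"), ("carol", "doc5")])
```

Any change to a leaf changes the root, which is why the user can detect tampered or omitted results by recomputing the root from the returned data.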
Second, how does a scheme prevent malicious servers from launching replay attacks and support three-party model? [13] proposes a timestamp-based scheme to solve this problem, but it can be only applied to the two-party model. We address this issue by combining the timestamp mechanism with the root of the B+-Tree and authorized users.
Finally, cost and user experience are common issues for most current schemes. In traditional verifiable SSE schemes, whether the keyword queried by the user exists or not, the server always needs to traverse the whole secure index and create the corresponding authenticator, which is inefficient for both the server and the users. A new idea should be adopted to differentiate the two situations and boost performance and user experience. The difficulty is how to make the newly designed scheme compatible with previous schemes while still supporting dynamic updating. Besides, it is essential to save users' costs: some cloud service providers charge not only for the storage requested by the client but also for the amount of computation [27], [28], [29]. How can the server avoid unnecessary searches? It is therefore necessary to improve the search efficiency of SSE. At present, most SSE schemes do not take charges into account; their query efficiency is the same whether the queried tokens are present or absent.
Here, we design a specific CBF that supports efficient dynamic updating and verification to solve the final challenge. Our methodology is straightforward but effective. When the keyword queried by the user does not exist, the server only needs to return a Bloom Filter authenticator to the user, avoiding traversing the whole index and creating the corresponding authenticator. The performance is greatly improved and is independent of the scale of the dataset when the queried keyword is missing.
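A minimal sketch of such a counting filter follows; the parameters m and k, the re-seeded SHA-256 hashing, and all names are illustrative assumptions rather than the paper's exact design. Counters instead of bits are what make deletions, and hence dynamic updates, possible.

```python
import hashlib

class CountingBloomFilter:
    """A minimal Counting Bloom Filter: each position holds a counter
    instead of a bit, so tokens can be removed on dynamic updates."""

    def __init__(self, m: int = 1024, k: int = 4):
        self.m, self.k = m, k
        self.counters = [0] * m

    def _positions(self, token: str):
        # Derive k positions by re-seeding one hash with the index i.
        for i in range(self.k):
            d = hashlib.sha256(f"{i}:{token}".encode()).digest()
            yield int.from_bytes(d[:8], "big") % self.m

    def add(self, token: str) -> None:
        for p in self._positions(token):
            self.counters[p] += 1

    def remove(self, token: str) -> None:
        for p in self._positions(token):
            if self.counters[p] > 0:
                self.counters[p] -= 1

    def might_contain(self, token: str) -> bool:
        # False means definitely absent; True may be a false positive.
        return all(self.counters[p] > 0 for p in self._positions(token))
```

When `might_contain` returns False, the server can skip the index traversal entirely and return only the CBF authenticator, which is the source of the constant-cost miss path described above.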
In Summary. Our contributions can be summarized as follows:
We propose an Efficient, Secure, and Verifiable SSE scheme under the three-party model. It can guarantee the integrity of the search query results returned by the server to the user and the freshness of the result.
Our secure index is based on B+-Tree and a specific CBF which supports secure verification, dynamic updating, and multi-user query. To the best of our knowledge, it is the first B+-Tree based SSE scheme which enables efficient verification, integrity checking, and dynamic updating.
The new designed CBF with counting capabilities supports efficient update and verification. It can greatly save both the server overhead for searching and users’ cost for verification, especially when the keyword queried by the user does not exist.
We verify the feasibility and security of the scheme through experiments. The results show that the CBF achieves stable and more efficient query and verification when the queried keyword does not exist.
Organization. The rest of the paper is organized as follows: Section 2 presents the related work. We introduce the threat model and our design goal in Section 3 and give a system overview of the proposed scheme in Section 4. The detailed construction is presented in Section 5. Then, we provide the security analysis in Section 6 and performance analysis and evaluation in Section 7 respectively. Finally, we present our conclusion in Section 8.
Related Work
Encrypted Storage Outsourcing. Outsourced storage [30], [31] allows clients with limited resources to outsource large amounts of data to cloud service companies at low cost. Since the data are directly encrypted and stored on the cloud server, users cannot query the encrypted data directly. If a user wants to update the database, the user must download the data locally, update the database, and upload it back to the server.
Searchable Encryption. In order to query the encrypted data of the cloud server, some relevant technologies were proposed to solve this problem [32], [33], [34], [35]. SSE is one of the most important schemes to solve this problem. SSE is a search query scheme based on the keyword index. It can perform operations such as querying on cloud-encrypted data. In the work [5], the authors provided a privacy-protection framework for outsourced media search. The work relies on multimedia hashing and symmetric encryption and tries to balance the strength of privacy enforcement, the quality of search, and the computation complexity. In general, we summarize the related work in the following.
Verifiable Searchable Symmetric Encryption. Some recent work focuses on forward or backward security. Most of it considers an honest-but-curious server that follows the defined protocol. Although these schemes can provide forward or backward security, the search results cannot be verified efficiently when the server performs active attacks. Based on this, some schemes with verification functionality were proposed [12], [15], [36]. [21] proposed dynamic authenticators for checking the integrity of the results, but cannot prevent replay attacks by malicious servers. For the verifiable schemes proposed in [14], [24], [37], either the query efficiency is unsatisfactory, or validation is not supported in the case of updates. The verifiable universal scheme proposed in [37] transforms a symmetric searchable encryption scheme into a no-dictionary verifiable scheme: users do not have to maintain a collection of all keywords, but it is a static approach. [13] solved this by using message authentication codes, but it cannot detect that the server intentionally returned an empty result. [15] proposes a secure verification scheme that can prevent replay attacks from cloud servers. However, this solution does not handle the case where the queried keyword does not exist.
Multi-User Searchable Encryption. Some work has tried to enhance the functionality and performance of data outsourcing and cloud services. The original SSE schemes work in a two-party model. As a natural extension of the two-party model, multi-user searchable encryption schemes are constantly being proposed; [15], [20], [36] are typical multi-user search encryption schemes. [20] was the first to propose a multi-user encryption scheme, based on broadcast encryption. [15] leveraged a Merkle Patricia Tree and incremental hashing to build a proof index supporting data updates. [36] utilizes a two-keyword index to reduce the searching time. However, [36] has a shortcoming: its two-keyword index comprises all the keywords together with all combinations of two keywords drawn from them, which significantly increases the storage and search overhead of the server. When the queried keyword does not exist, [15], [36] have the same efficiency as when the keyword exists, which reduces the search efficiency for absent keywords. [38] proposed a non-interactive multi-user searchable encryption scheme that reduces the interactions between the data owner and users; however, the scheme does not support data updates. [3] tries to utilize caches, encryption, and compression to improve performance and reduce the size of data transfers. [39] studies the problem of secure skyline queries over encrypted data. These solutions are still not full-fledged in aspects such as performance and verification.
Problem Statement
In this section, we will describe the attack model of SSE. Then, we formally present our design goals.
3.1 Threat Model
We assume that the data owner is credible and the users authorized by the data owner are also trusted. The cloud servers store the encrypted index and documents and perform SSE. We consider the cloud servers untrusted and malicious; they may perform active attacks. The server might deduce some sensitive information from the search results. Besides, when a client launches a query, the servers may return partial search results to the user, or they may initiate replay attacks. We try to mitigate such attacks in our scheme.
Definition 1 Replay Attack.
A replay attack in SSE is an attempt by a malicious server (or attacker) to return search results to a user from a historical version of the dataset.
Definition 2 Data Integrity Attack.
A data integrity attack in SSE occurs when a malicious server (or attacker) deliberately returns partial search results or an empty result to the user.
In addition, we don’t consider the access pattern leakage and query pattern leakage which can be tackled by oblivious methodologies. Also we don’t consider the information leakage through side channel or timing channel.
3.2 Design Goal
In this paper, our goal is to design an Efficient, Secure, Verifiable SSE scheme on the three-party model. We try to achieve the following goals:
The user can verify the correctness of the received result from the server.
The user can detect whether the server is launching a replaying attack.
The overall performance can be reasonably improved. For example, when the server receives the user's query, the server can save the overall cost through our specific algorithm and data structure.
The proposed data structure and algorithm can support efficient dynamic updating.
Overview Of ESVSSE
In this section, we will briefly describe our scheme. The main symbols used in this paper are given in Table 2.
4.1 System Structure
Fig. 1 shows the overall structure of our scheme. It consists of data owners, cloud servers, and authorized users. The data owner stores the encrypted data in a cloud server. Authorized users can search the data of the cloud server. Data owners have administrative rights over data in the cloud.
4.2 System Model
We aim to develop an Efficient, Secure, Verifiable Symmetric Searchable Encryption scheme (ESVSSE). The data owner uploads the encrypted documents, the authenticator, and the secure index built on the B+-Tree to the cloud server. The scheme allows the user to verify the integrity and freshness of the search results. Our ESVSSE scheme is defined as follows: ESVSSE is a three-party model in which the data owner stores secure indexes, authenticators, and encrypted documents on the cloud server. The data owner can authorize users to query the cloud servers; the authorization procedure is similar to [20]. The cloud server provides storage and search capabilities. Authorized users can initiate query and verification operations.
Definition 3. ESVSSE is a collection of seven polynomial-time algorithms.
Gen(1^k) → {K1, K2, K3, (ssk, spk)}: a probabilistic algorithm executed by the data owner. It takes as input a security parameter k and outputs the private keys K1, K2, K3 and a random signing key pair (ssk, spk).
Init(K1, K2, K3, ssk, D) → {I, π, π_bf, C}: an algorithm executed by the data owner. It takes as input the secret keys K1, K2, K3, the signing private key ssk, and the document set D, and outputs the secure index I, the encrypted documents C, the authenticator π, and the authenticator of the Counting Bloom Filter π_bf. The authenticator π_bf provides proof when the keyword queried by the user does not exist. The data owner stores I and π_bf locally and meanwhile sends I, C, π, and π_bf to the cloud server.
PreUpdate(K1, K2, K3, ssk, f) → {τ_ω, π, π_bf}: an algorithm run by the data owner. It takes as input the private keys K1, K2, K3, the signing private key ssk, and the file f to be updated, and outputs the update token τ_ω and the authenticators π and π_bf. The data owner sends τ_ω, π, and π_bf to the cloud server.
Update(I, τ_ω) → {I′}: an algorithm run by the cloud server. It takes as input the secure index I and the update token τ_ω, and outputs the new secure index I′.
Trapdoor(K1, ω) → {τ_ω}: a deterministic algorithm run by the authorized user to generate a trapdoor for a given keyword. It takes as input a secret key K1 and a keyword ω, and outputs a trapdoor τ_ω corresponding to ω.
Search(I, τ_ω, t_q) → {(ρ, π_tq, π_c, R_ω) or (π_bf)}: a deterministic algorithm run by the server. It takes as input the secure index I, the trapdoor τ_ω, and the query time t_q, and outputs the path list ρ of the search results, the authenticators π_tq and π_c, and the encrypted documents R_ω of the search results. Finally, the server sends ρ, π_tq, π_c, and R_ω to the user, or outputs the proof authenticator π_bf.
Validation(K1, K2, K3, spk, R_ω, ρ, π_tq, π_c, τ_ω, π_bf) → {0, 1}: an algorithm run by the authorized user. It takes as input the symmetric keys K1, K2, K3, the public key spk, the query result R_ω, the secure list ρ, the authenticators π_tq, π_c, and π_bf, and the token τ_ω corresponding to ω, and outputs a bit b. The user accepts the query result when b equals 1 and rejects it when b equals 0.
\begin{equation*}
\left\lbrace
\begin{array}{ll}
\pi_{i,0}=(\nu_{i,0},Sig(\nu_{i,0})) & up_{i}< tp_{i,0}\leq up_{i+1}\\
\nu_{i,0}=Enc_{K_{3}}(rt_{i,0}||tp_{i,0}) & \\
\ldots & \\
\pi_{i,j}=(\nu_{i,j},Sig(\nu_{i,j})) & tp_{i,j-1}< tp_{i,j}\leq up_{i+1}\\
\nu_{i,j}=Enc_{K_{3}}(rt_{i,j}||tp_{i,j}||\nu_{i,j-1}) & \\
\ldots & \\
\pi_{i,n}=(\nu_{i,n},Sig(\nu_{i,n})) & tp_{i,n}= up_{i+1}\\
\nu_{i,n}=Enc_{K_{3}}(rt_{i,n}||tp_{i,n}||\nu_{i,n-1}) &
\end{array}
\right. \tag{1}
\end{equation*}
\begin{equation*}
\left\lbrace
\begin{array}{l}
\pi^{n}_{bf}=(\psi^{n}_{bf},Sig(\psi^{n}_{bf}))\\
\psi^{n}_{bf}=Enc_{K_{3}}(CBF||tp_{i,n}||\psi^{n-1}_{bf})
\end{array}
\right. \tag{2}
\end{equation*}
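Equations (1) and (2) chain each period's authenticator to the previous one: every value ν encrypts the current state, its timestamp, and the previous ciphertext. The following Python sketch illustrates only this chaining idea; HMAC stands in for the paper's Enc_{K3} and Sig primitives (a real deployment would use 128-bit AES and an RSA signature, as in the experiments), and the keys and function names are illustrative assumptions.

```python
import hashlib
import hmac

K3 = b"demo-symmetric-key"   # stands in for the symmetric key K3
SSK = b"demo-signing-key"    # stands in for the signing key ssk

def enc(data: bytes) -> bytes:
    # Stand-in for Enc_{K3}; a real scheme would use authenticated AES.
    return hmac.new(K3, data, hashlib.sha256).digest()

def sig(data: bytes) -> bytes:
    # Stand-in for Sig; a real scheme would use an RSA signature.
    return hmac.new(SSK, data, hashlib.sha256).digest()

def chain_authenticator(states, timestamps):
    """Build chained authenticators in the spirit of Eqs. (1)-(2):
    each value nu_j binds (state_j || tp_j || nu_{j-1}), so the server
    cannot silently replay an older state without breaking the chain."""
    pi, prev = [], b""
    for state, tp in zip(states, timestamps):
        nu = enc(state + b"|" + str(tp).encode() + b"|" + prev)
        pi.append((nu, sig(nu)))
        prev = nu
    return pi
```

Because every link depends on its predecessor, substituting any earlier state or timestamp changes all later authenticators, which is what lets the user detect a replayed historical version.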
The Construction of ESVSSE
In this section, we present the detailed construction of ESVSSE. We start with the algorithms that build the secure index, and then describe the query, verification, and update procedures.
5.1 Building Secure Index
We use the B+-Tree structure to build a secure index. The index is built from the collection of documents D and the inverted list
An example illustrating the Bloom Filter is shown in Fig. 3. Suppose the data owner has three tokens; the server hashes them into the Bloom Filter through the four hash functions. Suppose the user then submits a token in a query. The server checks if there is a
5.2 Query Procedure
When an authorized user needs to query the data on the cloud server, the user first generates a token for the keyword and then sends the token to the cloud server. The cloud server performs the search algorithm; the specific algorithm is shown in Fig. 6. When the cloud server receives the token sent by the user, we consider both conditions: the token exists in the index, or it is absent. First, the server checks whether the token exists in the Counting Bloom Filter: the server hashes the queried token into another Counting Bloom Filter with the k hash functions and verifies whether the queried token exists by testing whether all the corresponding bits are 1. There are two cases. (1) If not all the tested bits are 1, the token is proved to be absent; in this case, the server only needs to return the CBF authenticator. (2) Otherwise, the token may exist, and the server searches the secure index and returns the results together with the corresponding authenticators.
5.3 Verifying the Search Results
For the verification of data integrity, we use a hash function to hash each node of the B+-Tree from the leaf nodes up to the root node, layer by layer. The user can verify the integrity of the query results against the root node. Alternatively, the data owner hashes only the leaf nodes. This paper uses a timestamp mechanism to verify the freshness of the data. The timestamp mechanism is shown in Fig. 7. It includes the update interval and the authenticator
In order to prevent the server from returning partial or historical versions of the query results, the authorized user performs the verification algorithm (shown in Fig. 8) upon receiving the query result from the cloud server. Similarly, there are two verification cases: one where the keyword does not exist and one where it exists.
When the queried keyword does not exist, the user can use the
When the searched keyword exists, the user rebuilds the root of the B+-Tree from the list returned by the server and verifies the integrity of the returned results by comparing the rebuilt root with the root in the decrypted authenticator. The freshness of the results is verified based on the timestamp in the decrypted authenticator.
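The root-rebuilding step above can be sketched as follows. This is an illustrative Merkle-style authentication-path check under our own assumptions (SHA-256, a `(sibling, is_left)` path encoding, and hypothetical function names); the paper's actual path list ρ may be laid out differently.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def recompute_root(leaf: bytes, path):
    """Rebuild the root from a returned leaf and its authentication
    path; `path` is a list of (sibling_digest, sibling_is_left) pairs
    ordered from the leaf level up to the root."""
    digest = h(leaf)
    for sibling, sibling_is_left in path:
        digest = h(sibling + digest) if sibling_is_left else h(digest + sibling)
    return digest

def verify_result(leaf: bytes, path, trusted_root: bytes) -> bool:
    # Accept only if the recomputed root matches the root recovered
    # from the decrypted (and freshness-checked) authenticator.
    return recompute_root(leaf, path) == trusted_root
```

A tampered or partial result changes the recomputed root, so the comparison against the authenticator's root rejects it.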
Let us explain a replay attack with the following example. Consider the following situations: 1. When the user queries at
5.4 Update
The Update algorithm updates the secure index on the server and supports three operations, i.e., the add, delete, and edit operations. The data owner first executes the pre-update algorithm to create the tokens for the update algorithm, CBF, the signature and sends them to the cloud server. Then, the cloud server executes the update algorithm to update the CBF and the secure index
5.5 An Illustrative Example
We illustrate our scheme through an example. Suppose there are nine files to initialize, i.e.
Now suppose that the user wants to query the keyword
Security Analysis
In this section, we conduct a security analysis of our scheme. On the one hand, we analyze the security and confidentiality of data on the cloud servers; on the other hand, we make sure that users authorized by the data owner can verify the search results correctly. Confidentiality means that an attacker cannot learn any useful information from the encrypted documents directly. Since the secure index is built from tokens, the cloud server cannot learn any useful information about the keywords. The data owner encrypts all the documents and stores them on the cloud server, so the server cannot learn any useful information from the encrypted documents.
Verifiability is the ability of users to verify the integrity and freshness of search results; users can detect whether the results have been tampered with or attacked. The general security proof is demonstrated using Real/Ideal simulation. Definition (confidentiality): Let ESVSSE = (Gen, Init, PreUpdate, Update, Trapdoor, Search, Validation) be an SSE scheme based on our CBF and B+-tree index, and let k be a security parameter. Suppose
Proof.
We will prove that there is a polynomial time simulator
The verifiability of a symmetric searchable encryption scheme means that the user can verify the integrity and freshness of the search results returned by the server. Integrity verification prevents the server from returning partial or incorrect search results. Freshness verification is achieved through a timestamp mechanism that thwarts replay attacks, preventing the server from returning historical search results to the user. Therefore, we use a game-based security definition to prove the verifiability of symmetric searchable encryption in ESVSSE.
Definition (verifiability): Assume that the ESVSSE scheme is a secure, dynamic, verifiable scheme based on symmetric searchable encryption. Consider the following probability experiment, and assume that
Besides, the query pattern and search pattern are leaked. This is common in many current schemes, and we can leverage oblivious RAM to alleviate such leakage.
Performance Analysis and Evaluation
In this section, we give a detailed performance analysis and evaluation of our scheme.
7.1 Performance Analysis
Since the existence check is independent of the searching algorithm, the average performance of our scheme depends on the missing rate of the keywords. We define
In contrast, the performance of most current schemes with the same functionality is about
In the real world, many factors give rise to an increasing missing rate, such as spelling errors, hitting an adjacent key when typing, etc. It also depends on how the index is built. Since all these aspects vary across people and datasets, we give a general analysis. People mainly operate computers through a keyboard interface, and they often make spelling errors when using a keyboard, for example, hitting an adjacent key or using phonologically similar letters [40]. In [40], spelling errors are defined as (1) a typo from an incorrect keyboard operation, and (2) a spelling confusion from an incorrect identification. Their experiments show that among the corrected errors, substitution errors were the most frequent, deletion and insertion errors were almost equally common, and transposition errors were few. [41] studied the types of English spelling errors of students whose native language is English. Spelling errors were analyzed by grade-level spelling patterns and linguistic characteristics: phonological, orthographic, orthographic image, transposition, and morphological. Their results revealed that morphological spelling pattern errors occurred most frequently. Of the opportunities available for students to make linguistic spelling errors, phonological errors occurred 1.30 percent of the time, orthographic errors 8.52 percent, orthographic image errors 8.23 percent, and morphological errors 56.2 percent. The dataset also affects the built index, as does the way people build the index. For example, in the typical Enron mailbox dataset [42], if all the IDs are put into the index, it is hard for people to distinguish them.
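As a rough model (our own simplification, not a claim from the experiments), let p_miss be the keyword missing rate, k the number of CBF hash functions, b the B+-Tree order, N the number of key-value pairs, and |R_ω| the result size. The expected per-query server cost can then be written as:

```latex
\mathbb{E}[\mathrm{cost}] \;\approx\; p_{\mathrm{miss}} \cdot O(k)
\;+\; \left(1 - p_{\mathrm{miss}}\right) \cdot O\!\left(\log_{b} N + |R_{\omega}|\right)
```

The savings therefore grow with the missing rate, since the CBF check costs O(k) regardless of the dataset size, while a hit still pays the index traversal and proof generation.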
7.2 Experiment Setup
In order to prove the feasibility of our proposed scheme, we have implemented it in Java with JDK 8.0. The prototype comprises about 4,000 lines of code. The operating system we used is 64-bit Windows 7, and the computer has an i5-6500 CPU and 16 GB RAM. We use the Python language to extract keywords. We use 128-bit AES to encrypt the authenticator and sign it with an RSA signature. We implement two random oracles with HMAC-SHA256 and SHA3-256. We use the Enron mailbox dataset [42]. The following subsections present our experimental and testing results.
7.3 Index Construction Cost
First, we measure the delays of the initialization algorithm performed by the data owner, as shown in Fig. 12. The Init algorithm includes the construction of the secure index and the CBF, and the generation of the authenticator. All the delays and subsequent measurements are averages over ten runs. We tested the delays of 500,000 key-value pairs for establishing B+-Tree secure indexes of different orders, mainly measuring the relationship between the number of keyword-document pairs and the total time required to generate the secure index and authenticator. We tested B+-Trees of order 100, 500, and 900. As the number of key-value pairs increases, the time consumed also increases. When the number of key-value pairs is less than 100,000, the time overheads are similar. Between 100,000 and 200,000 pairs, the time overhead of the order-900 tree is slightly larger than the other two, and the total initialization times of the order-100 and order-500 B+-Trees are almost the same. Above 200,000 pairs, the order-100 B+-Tree has the minimum initialization time overhead, and the order-900 tree costs the most. Overall, initialization takes around 1.5 seconds when the documents include 500,000 keywords, which is acceptable.
We measure the storage cost B+-Tree as shown in Fig. 13. The dataset contains 500,000 keywords. The storage overhead is about 100 MB. As the number of keywords continues to increase, the storage overhead also increases linearly. We tested the storage space of B+-Tree of different orders. As shown in Fig. 13, the storage space of the B+-Tree has little relationship with the order of the tree.
7.4 Performance of Update
The update delay is decided by the size of the database, measured by the number of keywords. Strictly speaking, the delay is also directly related to the order of the B+-Tree. To show the relationship between update latency and database size, we use different numbers of keywords to measure latency. Since the number of keywords varies across files, we measure throughput as the number of keyword-document pairs that can be updated per second (see Fig. 14). When the number of key-value pairs is between 100,000 and 200,000, the throughput of delete operations decreases as the number of keywords increases; it is almost stable between 200,000 and 500,000. In this interval the throughput is also related to the order: as the order increases, the delete throughput decreases. For a fixed dataset, the delete throughput changes with the order by a gap of 10,000 to 20,000 pairs per second. We observed that the throughputs of add and delete operations are almost the same, and both are relatively stable.
7.5 Performance of Query
When the user launches a query operation, the token of the corresponding keyword is generated and then sent to the server. Fig. 15 shows the relationship between the number of keyword-document pairs and the search time when the server performs a search. Note that this experiment only measures the cost of generating the proof; the waiting time for the authenticator at the checkpoint is not included. Fig. 16 shows the query time on the CBF when the keyword queried by the user does not exist. As we can see from the figure, the server
7.6 Performance of Verification
In Fig. 17, we evaluate the verification delays for the users authorized by the data owner. We can see that the search time is proportional to the number of keyword-document pairs: the larger the number of keywords, the longer the server takes to search. For a fixed number of keywords, the server's search time is also related to the order of the B+-Tree; as shown in Fig. 17, the smaller the order, the longer the search. When the order of the B+-Tree is 500 and 900, the verification delays are similar but the search delays differ. For a database of 500,000 key-value pairs, the verification delay is approximately 4 milliseconds. As shown in Fig. 17, the user's verification time is proportional to the size of the dataset; when the dataset is fixed, the verification cost is stable and very short, whereas the validation cost of the other two schemes increases linearly.
Fig. 18 illustrates the verification time when a keyword queried by a user does not exist, and compares the result with previous work. Here, the verification time includes the time for the user to decrypt the verifier and the time to perform the comparison; since hashing the queried keyword into the CBF is very fast, its cost can be ignored. We tested how the CBF authenticator's running time changes with the number of keywords and compared it with two other schemes: MVSSE in Fig. 18 is the scheme proposed in [36], and VSSE is the scheme proposed in [15]. As the figure shows, the verification time of the ESVSSE scheme is almost constant as the number of keywords increases, while the verification time of the other two schemes grows with the number of keywords. In general, our scheme is more efficient than existing schemes under the same attack model.
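The near-constant verification time follows from the CBF itself: a membership check touches only k counters regardless of how many keyword-document pairs are stored. A toy Counting Bloom Filter makes this visible (an illustrative sketch, not the paper's implementation; the parameters m and k are assumptions):

```python
import hashlib

class CountingBloomFilter:
    """Toy Counting Bloom Filter: each item maps to k counter positions,
    and counters (rather than bits) make deletion possible. A lookup
    reads only k counters, independent of the number of stored items."""

    def __init__(self, m: int = 1 << 16, k: int = 4):
        self.m, self.k = m, k
        self.counters = [0] * m

    def _positions(self, item: str):
        # Derive k positions from k salted hashes of the item.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.counters[p] += 1

    def remove(self, item: str) -> None:
        for p in self._positions(item):
            self.counters[p] -= 1

    def contains(self, item: str) -> bool:
        # O(k) work: true membership always answers True; a non-member
        # answers False except with negligible false-positive probability.
        return all(self.counters[p] > 0 for p in self._positions(item))

cbf = CountingBloomFilter()
for w in ("alice", "bob"):
    cbf.add(w)
print(cbf.contains("alice"), cbf.contains("mallory"))
```

Because `contains` always performs exactly k counter reads, the verification cost stays flat as the keyword set grows, matching the trend in Fig. 18.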
Fig. 19 shows the throughput of the authenticator.
7.7 Storage Cost of Verification
To better evaluate the communication overhead of this scheme during the verification phase, we test the storage cost of the validator for keyword-document pairs in different situations. The scheme distinguishes two cases: the keyword searched by the user exists (see Fig. 20), or it does not exist (see Fig. 21). We also test the memory occupied by the validator for different B+-Tree orders. The experiments show that the validator's size is related to the B+-Tree order: when the search token exists, the minimum validator size for a dataset of 500,000 document-keyword pairs is 400 KB, while at order 100 the validator occupies 1100 KB. Choosing an appropriate order is therefore also a way to reduce storage overhead. When the keyword does not exist, the server returns the CBF verifier. The CBF verifier consists of all the leaf nodes of the hash B-tree, so its size is independent of the tree order; for 500,000 keyword-document pairs, the validator occupies 1220 KB. In practice, data comes in many formats, not just documents, so a validator at the KB level is still acceptable.
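A validator of roughly the reported size is what textbook Bloom-filter sizing would predict. The snippet below applies the standard formula m = -n ln(p) / (ln 2)^2 with 4-bit counters; the false-positive rate is an assumption on our part, since the paper does not state its CBF parameters.

```python
import math

def cbf_size_kb(n: int, fp_rate: float, counter_bits: int = 4) -> float:
    """Textbook Bloom-filter sizing: m = -n*ln(p)/ln(2)^2 slots, each a
    counter_bits-wide counter (4 bits is the usual CBF choice)."""
    m = math.ceil(-n * math.log(fp_rate) / (math.log(2) ** 2))
    return m * counter_bits / 8 / 1024

# 500,000 items at a (hypothetical) 10% false-positive rate:
print(f"{cbf_size_kb(500_000, 0.1):.0f} KB")  # on the order of the reported ~1220 KB
```

Tightening the false-positive rate grows the filter logarithmically in 1/p, so the KB-scale overhead remains modest even at stricter settings.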
7.8 Comparison With Existing Schemes
We compare our scheme with the concrete implementations of the dynamic SSE schemes proposed by Cash et al. [20] and Zhu et al. [15]. The experimental results show that the additional overhead introduced by the ESVSSE scheme is small. To compare the schemes fairly, we test with the same dataset and parameters on our machine. As shown in Fig. 22, we measure the overall overhead of the initialization, search, and update phases in these schemes, and report the average performance over multiple runs.
We verify the feasibility of our scheme by comparing it with a well-known dynamic SSE scheme and a generic verifiable scheme. To compare these schemes fairly, we use the same dataset and parameters on our machines. We tested on a well-known email dataset containing 500,000 documents, from which we extracted 2 million document-keyword pairs. The test covers the initialization, search, and update stages: initialization builds the index over the 2 million document-keyword pairs, and the search and verification phases are tested on 10,000 entries. The results are shown in Table 3. Because search results differ across queries, we report the average communication cost over 5,000 runs. As Table 3 shows, the average search token of SSE [20] is 390 B and the search result is 53 KB; the token of the GSSE scheme is 32 B, with a search-and-verification overhead of 3 KB. In our scheme, the storage overhead of search verification is 2.7 KB, which shows that the overhead introduced by ESVSSE is acceptable.
Future Directions and Conclusion
In future work, we can consider more complex searches, such as multi-keyword extraction and conjunctive query search. In addition, the Bloom filter can produce false positives, but the probability is negligible and, compared with the computational overhead it saves, remains acceptable.
As graph data such as social networks and biological networks continue to grow, current SSE cannot fully satisfy queries over graph data. In the future, we can study more complex search directions, such as matrix queries and graph adjacency queries. With the rise of Network Function Virtualization (NFV) technology, some studies have applied searchable encryption to middleboxes. Given the attractive properties of the blockchain, combining blockchain with searchable encryption is also a promising direction.
In summary, this paper presents an efficient SSE scheme based on the B+-Tree and the CBF which supports secure verification, dynamic updating, and multi-user queries. Thanks to the CBF, the non-existence of a queried keyword can be verified in near-constant time, independent of the database size.
ACKNOWLEDGMENTS
The authors thank the anonymous reviewers and the editor for their valuable comments and suggestions. This work was partially supported by the National Natural Science Foundation of China (No. 61672176), the Guangxi "Bagui Scholar" Teams for Innovation and Research Project, the Guangxi Talent Highland Project of Big Data Intelligence and Application, the Guangxi Science and Technology Plan Project No. AD20159039, the Guangxi Young and Middle-aged Ability Improvement Project No. 2020KY02032, and the Innovation Project of Guangxi Graduate Education No. YCSW2020110.