
Soft Computing (2020) 24:4675–4691 https://doi.org/10.1007/s00500-019-04228-4

A fuzzy similarity-based rough set approach for attribute selection in set-valued information systems

Shivani Singh¹ · Shivam Shreevastava² · Tanmoy Som² · Gaurav Somani²

Published online: 23 July 2019

Springer-Verlag GmbH Germany, part of Springer Nature 2019

Abstract

Databases obtained from different search engines, market data, patients' symptoms and behaviours, etc., are some common examples of set-valued data, in which a set of values is associated with a single entity. In the real-world data deluge, various irrelevant attributes lower the ability of experts both in speed and in predictive accuracy, due to high dimension and insignificant information, respectively. Attribute selection is the concept of selecting those attributes that ideally are necessary as well as sufficient to better describe the target knowledge. Rough set-based approaches can handle the uncertainty available in real-valued information systems after a discretization process. In this paper, we introduce a novel approach for attribute selection in set-valued information systems based on tolerance rough set theory. A fuzzy tolerance relation between two objects using a similarity threshold is defined. We find reducts based on the degree of dependency method for selecting the best subsets of attributes in order to obtain higher knowledge from the information system. Analogous results of rough set theory are established for the proposed method for validation. Moreover, we present a greedy algorithm along with some illustrative examples to clearly demonstrate our approach without checking each pair of attributes in set-valued decision systems. Examples for calculating the reduct of an incomplete information system are also given using the proposed approach. Comparisons are performed between the proposed approach and fuzzy rough-assisted attribute selection on a real benchmark dataset, as well as with three existing approaches for attribute selection on six real benchmark datasets, to show the superiority of the proposed work.

Keywords: Set-valued data · Rough set · Fuzzy tolerance relation · Degree of dependency · Attribute selection

Introduction

Communicated by V. Loia.

& Shivam Shreevastava shivam.rs.apm@itbhu.ac.in

Shivani Singh shivanithakur030@gmail.com

Tanmoy Som tsom.apm@itbhu.ac.in

Gaurav Somani gaurav.somani.mat14@itbhu.ac.in

1 DST-Centre for Interdisciplinary Mathematical Sciences, Institute of Science, BHU, Varanasi 221005, India

2 Department of Mathematical Sciences, IIT (BHU), Varanasi 221005, India

Many real applications in the area of machine learning and data mining consist of set-valued data, i.e. data where the attribute value of an object is not unique but a set of values; for example, in a venture investment company, the set of evaluation results given by experts (Qian et al. 2010b), the set of languages for each person from the foreign language ability test (Qian et al. 2009), and, in a medical database, the set of patients' symptoms and activities (He and Naughton 2009). These kinds of information systems are called set-valued information systems, which are another important type of data table and are generalized models of single-valued information systems. An incomplete information system (Kryszkiewicz 1998, 1999; Dai 2013; Dai and Xu 2012; Dai et al. 2013; Leung and Li 2003; Yang et al. 2011, 2012) can be converted into a set-valued information system by replacing all missing values with the set of all possible values of each attribute.

Set-valued information systems are always used to portray the inexact and lost information in a given dataset, in which the attribute set may vary with time as new information is added.

Dataset dimensionality is the main hurdle for computational applications in pattern recognition and other machine learning tasks. In many real-world applications, the generation and expansion of data occur continuously and thousands of attributes are stored in databases. Gathering useful information and mining required knowledge from an information system is the most difficult task in the area of knowledge-based systems. Not all attributes are relevant to the learning tasks, as they reduce the real performance of proposed algorithms and increase the training and testing times. In order to enhance the classification accuracy and knowledge prediction, attribute subset selection (Hu et al. 2008; Yang and Li 2010; Qian et al. 2010a; Qian et al. 2011; Shi et al. 2011; Jensen and Shen 2009; Jensen et al. 2009) plays a key role via eliminating redundant and inconsistent attributes. Attribute selection is the process of selecting the most informative attributes of a given information system to reduce the classification time, complexity and overfitting.

Rough set approximations (proposed by Pawlak 1991 and Pawlak and Skowron 2007a, b, c) are the central point of approaches to knowledge discovery. Rough set theory (RST) uses only internal information and does not depend on prior model conventions, and it can be used to extract and signify the hidden knowledge available in information systems. It has many applications in the fields of decision support, document analysis, data mining, pattern recognition, knowledge discovery and so on. In rough sets, several discrete partitions are needed in order to tackle real-valued attributes, and then the dependency of the decision attribute over conditional attributes is calculated. The intrinsic error due to this discretization process is the main issue while computing the degree of dependency of real-valued attributes.

Dubois and Prade (1992) combined a fuzzy set (Zadeh 1996) with a rough set and proposed the fuzzy rough set to provide an important tool for reasoning with uncertainty in real-valued datasets. Fuzzy rough sets combine the distinct concepts of indiscernibility (for rough sets) and vagueness (for fuzzy sets) available in the datasets and have been successfully applied to many fields. However, very few researchers are working in the area of set-valued information systems under the framework of the rough set model in a fuzzy environment.

In this paper, we introduce a novel approach for attribute selection in set-valued information systems based on tolerance rough set theory. We define a fuzzy relation between two objects of a set-valued information system. A fuzzy tolerance relation is introduced by using a similarity

threshold to avoid misclassification and perturbation in order to tackle uncertainty in a much better way. Based on this relation, we calculate tolerance classes of each object to determine lower and upper approximations of any subset of the universe of discourse. The positive region of the decision attribute over a subset of conditional attributes can be calculated using lower approximations. The degree of dependency of the decision attribute over a subset of conditional attributes is the ratio of the cardinality of the positive region to the cardinality of the universe of discourse. Analogous results of rough set theory are established for our proposed method for validation. Moreover, we present a greedy algorithm to clearly demonstrate our approach without calculating the degree of dependency for each pair of attributes. Illustrative example datasets are given for better understanding of our proposed approach. We compare the proposed approach with other existing approaches on real datasets and test the statistical significance of the obtained results.

The rest of the paper is structured as follows. Related works are given in Sect. 2. In Sect. 3, basic definitions related to incomplete and set-valued information systems are given. The proposed concept for set-valued datasets is presented and thoroughly studied in Sect. 4. In Sect. 5, analogous results of rough set theory are verified for the newly proposed approach. An algorithm for attribute selection in set-valued information systems is presented in Sect. 6. Illustrative examples with comparative analysis are given to demonstrate the proposed model in Sect. 7. In Sect. 8, experimental analysis is performed on six real benchmark datasets. Section 9 concludes our work.

Related works

Nowadays, set-valued datasets are generated through many sources. Dimensionality reduction is a key issue for such types of datasets in order to reduce complexity, time and cost. Different criteria have been proposed by a few researchers to deal with set-valued datasets and to evaluate the best suitable attributes in the process of attribute selection. Lipski (1979, 1981) gave the idea of representing an incomplete information system as a set-valued information system and studied their basic properties. He also investigated the semantic and logical problems that often occur in an incomplete information system. The concepts of internal and external interpretations are introduced in the paper. Internal interpretation is shown to lead towards the notion of topological Boolean algebra and a modal logic, whereas external interpretation, which is related to referring queries directly to reality, leads towards Boolean algebra and classical logic. Orlowska and Pawlak (1984) established a method to deal with non-deterministic information system

which is considered as set-valued data. They defined a language in order to define non-deterministic information and introduced the concept of knowledge representation system.

A generalized decision logic, an extension of the decision logic studied by Pawlak, for interval set-valued information systems is presented by Yao and Liu (1999). They introduced two types of satisfiability of a formula, namely interval-degree truth and interval-level truth. They also proposed the generalized decision logic DGL and interpreted this concept based on the two types of satisfiability. A detailed discussion on inference rules is also presented. Yao (2001) presented a concept of granulation for a universe of discourse in set-valued information systems and reviewed the corresponding approximation structure. The concepts of ordered granulation and approximation structures are used in defining stratified rough set approximations. He first defined a nested sequence of granulations and then a corresponding nested sequence of rough set approximations, which leads to a more general approximation structure.

Shoemaker and Ruiz (2003) introduced an extension of the Apriori algorithm that is able to mine association rules from set-valued data. They introduced two different algorithms for mining association rules from set-valued data and compared their outcomes. They established a system based on one of these algorithms and applied it on some biological datasets for justification. Set-valued information systems were presented by Guan and Wang (2006). To classify the universe of discourse, they proposed a tolerance relation and used maximal tolerance classes. They introduced the concept of relative reduct of maximal tolerance classes and used a Boolean reasoning technique for calculating the relative reduct by defining a discernibility function. The concepts of E-lower, A-upper and A-lower relative reducts for set-valued decision systems are also discussed in detail.

For the conjunctive/disjunctive types of set-valued ordered information systems, a dominance-based rough set approach was introduced by Qian et al. (2010b). This model is based on the substitution of the indiscernibility relation by a dominance relation. They also developed a new approach to sorting objects in disjunctive set-valued ordered information systems. This approach is useful in simplifying a disjunctive set-valued ordered information system. Criterion reduction for a set-valued ordered information system is also discussed. Based on a variable precision relation, Yang et al. (2010) generalized the notion of Qian et al. by defining an extended rough set model and propounded a variable precision dominance relation for set-valued ordered information systems. They presented an attribute reduction method based on the discernibility matrix approach by using their proposed relation.

Zhang et al. (2012) proposed matrix approaches based on rough set theory with dynamic variation of attributes in set-valued information systems. In this paper, they defined the lower and upper approximations directly by using the basic vector generated by the relation matrix in the set-valued information system. The concept of updating the lower and upper approximations is also introduced by use of the variation of the relation matrix. Luo et al. (2013) investigated the updating mechanisms for computing lower and upper approximations with the variation of the object set. The authors proposed two incremental algorithms for updating the defined approximations in disjunctive/conjunctive set-valued information systems. After experiments on several datasets for checking the performance of the proposed algorithms, they showed that the incremental approaches are far better than the non-incremental approaches.

Wang et al. (2013) defined a new fuzzy preference relation and fuzzy rough set technique for disjunctive-type interval and set-valued information systems. They discussed the concept of relative significance measure of conditional attributes in interval and set-valued decision systems by using the degree of dependency approach. In this paper, the authors mainly focused on the semantic interpretation of the disjunctive type only. They also presented an algorithm for calculating the fuzzy positive region in interval and set-valued decision systems.

An incremental algorithm was designed to reduce the size of dynamic set-valued information systems by Lang et al. (2014). They presented three different relations and investigated their basic properties. Two types of discernibility matrices based on these relations for set-valued decision systems are also introduced. Furthermore, using the proposed relations and information system homomorphisms, a large-scale set-valued information system is compressed into a smaller information system. They addressed the compression updating via variations of the feature set, immigration and emigration of objects and alterations of attribute values.

In set-valued ordered decision systems, Luo et al. (2014) worked on maintaining approximations dynamically and studied the approximations of decision classes by defining the dominant and dominated matrices via a dominance relation. The updating properties for dynamic maintenance of approximations were also introduced for when the evolution of the criteria values with time occurs in the set-valued decision system. Firstly, they constructed a matrix-based approach for computing lower and upper approximations of upward and downward unions of decision classes. Furthermore, incremental approaches for updating approximations are presented by modifying relevant matrices without retraining from the start on all accumulated training data.

Shu and Qian (2014) presented an attribute selection method for set-valued data based on the mutual information of the unmarked objects. Mutual information-based feature selection methods use the concept of dependency among features. Unlike the traditional approaches, here the mutual information is calculated on the unmarked objects in the set-valued data. Furthermore, a mutual information-based feature selection algorithm is developed and implemented on a universe of discourse to fasten the feature selection process. Due to the dynamic variation of criteria values in set-valued information systems, Luo et al. (2015) presented the properties for dynamic maintenance of approximations. Two incremental algorithms for modernizing the approximations in disjunctive/conjunctive set-valued information systems are presented, corresponding to the addition and removal of criteria values, respectively.

Most of the above approaches are based on classical and rough set techniques, which have their own limitation of discretization, which leads to information loss. Rough set in fuzzy environment-based methods deal with the uncertainty as well as the noise available in an information system in a much better way as compared to classical and rough set-based approaches, without the requirement of any discretization process. Dai et al. (2013) defined a fuzzy relation between two objects and constructed a fuzzy rough set model for attribute reduction in set-valued information systems based on discernibility matrices. In this paper, the similarity of two objects in a set-valued information system is taken up to a threshold value in order to avoid misclassification and perturbation, and a tolerance rough set-based attribute selection is presented by using the degree of dependency approach.

Preliminaries 

In this section, we describe some basic concepts, symbolization and examples of set-valued information systems.

Definition 3.1 (Huang 1992) A quadruple $IS = (U, AT, V, h)$ is called an information system, where $U = \{u_1, u_2, \ldots, u_n\}$ is a non-empty finite set of objects, called the universe of discourse, $AT = \{a_1, a_2, \ldots, a_m\}$ is a non-empty finite set of attributes, $V = \bigcup_{a \in AT} V_a$, where $V_a$ is the set of attribute values associated with each attribute $a \in AT$, and $h: U \times AT \to V$ is an information function that assigns particular values to the objects against the attribute set such that $\forall a \in AT, \forall u \in U,\ h(u, a) \in V_a$.

Definition 3.2 (Huang 1992) In an information system, if each attribute has a single entity as its attribute value, then it is called a single-valued information system; otherwise it is known as a set-valued information system. A set-valued information system is a generalization of the single-valued information system, in which an object can have more than one attribute value. Table 1 illustrates a set-valued information system.

Table 1 Set-valued information system

U     c1            c2            c3       c4
u1    {1,2,3,4}     {0,1}         {1,2}    0.4
u2    {2,3}         {2,3}         {1}      0.5
u3    {1,2,3,4}     {1,2}         {1,2}    0.9
u4    {2,3,4}       {0,1,2,3}     {0,1}    0.2
u5    {2,4}         {0,1,2}       {0,1}    1

Definition 3.3 (Guan and Wang 2006) An information system $IS = (U, AT, V, h)$ is said to be a set-valued decision system if $AT = C \cup D$, where $C$ is a non-empty finite set of conditional attributes and $D$ is a non-empty collection of decision attributes with $C \cap D = \emptyset$. Here $V = V_C \cup V_D$, with $V_C$ and $V_D$ as the sets of conditional attribute values and decision attribute values, respectively, and $h$ is a mapping from $U \times (C \cup D)$ to $V$ such that $h: U \times C \to 2^{V_C}$ is a set-valued mapping and $h: U \times D \to V_D$ is a single-valued mapping. Table 2 exemplifies a set-valued decision system.

To give a semantic interpretation of the set-valued data, many ways are given (Guan and Wang 2006); here we encapsulate them as two types. In Type 1, $h(u, a)$ is interpreted conjunctively, and in Type 2, $h(u, a)$ is interpreted disjunctively. For example, if $a$ is the attribute "speaking a language", then $h(u, a) = \{\text{Chinese, Spanish, English}\}$ can be inferred as: $u$ speaks Chinese, Spanish and English in the case of Type 1, and $u$ speaks Chinese or Spanish or English, i.e. $u$ can speak only one of them, in the case of Type 2. Incomplete information systems with some unknown attribute values or partially known attribute values are Type 2 set-valued information systems.

In many real-world application problems, a lot of missing data exists in the information system due to ambiguity and incompleteness. All missing values presented in an information system can be characterized by the set of all possible values of the corresponding attribute. This type of information system can also be considered as a special case of a set-valued information system, in which some attribute values of objects are missing. Table 4 illustrates the transformation of an incomplete information system into a set-valued information system.
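For illustration, a minimal Python sketch of this conversion is given below; the function name, the missing-value markers and the dictionary layout are our own assumptions rather than part of the paper. Every missing entry is replaced by the set of all values observed for that attribute.

def to_set_valued(objects, attributes, missing=(None, '?')):
    # domain of each attribute = all values observed in the known entries
    domains = {a: {obj[a] for obj in objects.values() if obj[a] not in missing}
               for a in attributes}
    converted = {}
    for name, obj in objects.items():
        converted[name] = {a: frozenset(domains[a]) if obj[a] in missing
                           else frozenset({obj[a]})
                           for a in attributes}
    return converted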

Table 2 Set-valued decision system

U     c1            c2            c3       c4     D
u1    {1,2,3,4}     {0,1}         {1,2}    0.4    1
u2    {2,3}         {2,3}         {1}      0.5    1
u3    {1,2,3,4}     {1,2}         {1,2}    0.9    2
u4    {2,3,4}       {0,1,2,3}     {0,1}    0.2    1
u5    {2,4}         {0,1,2}       {0,1}    1      2
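The code sketches in the following sections assume an in-memory representation of Table 2 along the following lines; the dictionary layout and names are ours, not the paper's.

table2 = {
    'u1': {'c1': frozenset({1, 2, 3, 4}), 'c2': frozenset({0, 1}),
           'c3': frozenset({1, 2}), 'c4': 0.4, 'd': 1},
    'u2': {'c1': frozenset({2, 3}), 'c2': frozenset({2, 3}),
           'c3': frozenset({1}), 'c4': 0.5, 'd': 1},
    'u3': {'c1': frozenset({1, 2, 3, 4}), 'c2': frozenset({1, 2}),
           'c3': frozenset({1, 2}), 'c4': 0.9, 'd': 2},
    'u4': {'c1': frozenset({2, 3, 4}), 'c2': frozenset({0, 1, 2, 3}),
           'c3': frozenset({0, 1}), 'c4': 0.2, 'd': 1},
    'u5': {'c1': frozenset({2, 4}), 'c2': frozenset({0, 1, 2}),
           'c3': frozenset({0, 1}), 'c4': 1.0, 'd': 2},
}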

Table 3 Missing value dataset

Table 4 Set-valued decision system obtained from Table 3

Proposed methodology

In this section, we define a new kind of fuzzy relation between two objects, and its supremacy over the previously defined relation is shown using an example. Then, lower and upper approximations are defined using a threshold value on the fuzzy similarity degree. Some basic properties of the above-defined lower and upper approximations are also discussed.

Fuzzy relation between objects

In this subsection, we first present the definition of the tolerance relation available in the literature and then propose a new fuzzy relation between two objects. In continuation, we compare both definitions through an example.

Definition 4.1 (Orlowska 1985; Yao 2001) For a set-valued information system $IS = (U, AT, V, h)$, $\forall b \in AT$ and $u_i, u_j \in U$, a tolerance relation is defined as
$T_b = \{(u_i, u_j) \mid b(u_i) \cap b(u_j) \neq \emptyset\}.$   (1)
For $B \subseteq AT$, a tolerance relation is defined as
$T_B = \{(u_i, u_j) \mid b(u_i) \cap b(u_j) \neq \emptyset,\ \forall b \in B\} = \bigcap_{b \in B} T_b,$   (2)
where $(u_i, u_j) \in T_B$ implies that $u_i$ and $u_j$ are indiscernible (tolerant) with respect to the set of attributes $B$.

Example 4.1 Let $(U, AT, V, h)$ be a set-valued information system with $b \in AT$ and $u_1, u_2, u_3 \in U$ such that $b(u_1) = \{v_1, v_2, v_3, v_4\}$, $b(u_2) = \{v_4, v_5, v_6, v_7\}$ and $b(u_3) = \{v_1, v_2, v_3\}$. Then, by Definition 4.1, both $(u_1, u_2)$ and $(u_1, u_3)$ belong to $T_b$; that is, $u_1, u_2$ are indiscernible with respect to attribute $b$ and, simultaneously, $u_1, u_3$ are indiscernible with respect to attribute $b$.

It is obvious from the above example that discerning $u_1$ and $u_3$ is more difficult than discerning $u_1$ and $u_2$, but Definition 4.1 is not able to describe the extent to which two objects are related. To overcome this issue, we define a fuzzy relation for a set-valued dataset.

Definition 4.2 Let $SVIS = (U, AT, V, h)$ be a set-valued information system. For every $b \in AT$, we define a fuzzy relation $R_b$ as
$\mu_{R_b}(u_i, u_j) = \frac{2\,|b(u_i) \cap b(u_j)|}{|b(u_i)| + |b(u_j)|}.$   (3)
For a set of attributes $B \subseteq AT$, a fuzzy relation $R_B$ can be defined as
$\mu_{R_B}(u_i, u_j) = \inf_{b \in B} \mu_{R_b}(u_i, u_j).$   (4)

Example 4.2 (continued from Example 4.1) After calculating the degree of similarity by using the fuzzy relation defined in Eq. (3), we get
$\mu_{R_b}(u_1, u_2) = \frac{2\,|b(u_1) \cap b(u_2)|}{|b(u_1)| + |b(u_2)|} = \frac{2}{8} = 0.25,$
$\mu_{R_b}(u_1, u_3) = \frac{2\,|b(u_1) \cap b(u_3)|}{|b(u_1)| + |b(u_3)|} = \frac{6}{7} \approx 0.86.$
Now, we can easily calculate the degree to which two objects are discernible.
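A minimal Python sketch of Eqs. (3) and (4) follows; the function names are ours, and for a real-valued attribute such as c4 in Table 2 we assume the similarity 1 - |x - y|, which appears consistent with the corresponding block of Table 5 but is not stated explicitly in the paper.

def mu_b(x, y):
    """Eq. (3) for set-valued entries; 1 - |x - y| assumed for numeric entries."""
    if isinstance(x, frozenset):
        return 2.0 * len(x & y) / (len(x) + len(y))
    return 1.0 - abs(x - y)

def mu_B(obj_i, obj_j, B):
    """Eq. (4): infimum of the per-attribute similarities over B."""
    return min(mu_b(obj_i[b], obj_j[b]) for b in B)

# Example 4.2 revisited
b1 = frozenset({'v1', 'v2', 'v3', 'v4'})
b2 = frozenset({'v4', 'v5', 'v6', 'v7'})
b3 = frozenset({'v1', 'v2', 'v3'})
print(mu_b(b1, b2))            # 0.25
print(round(mu_b(b1, b3), 2))  # 0.86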

Fuzzy tolerance relation-assisted rough approximations

In this subsection, a fuzzy tolerance relation using a threshold value is defined, and lower and upper approximations of a set are presented.

If we ignore some misclassification and perturbation by using a threshold value on the fuzzy relation between two objects as given in Eq. (3), then the involvement of fuzzy sets in the computation of the fuzzy lower approximation will increase and the fuzzy positive region enlarges. Thus, the knowledge representation ability becomes much stronger with respect to misclassification.

So we define a new kind of binary relation using a threshold value $\alpha$ as follows:
$T_b^{\alpha} = \{(u_i, u_j) \mid \mu_{R_b}(u_i, u_j) \geq \alpha\},$   (5)
where $\alpha \in (0, 1)$ is a similarity threshold, which gives a level of similarity for the insertion of objects within tolerance classes.
For a set of attributes $B \subseteq AT$, we define a binary relation as
$T_B^{\alpha} = \{(u_i, u_j) \mid \mu_{R_B}(u_i, u_j) \geq \alpha\},$   (6)
where $\mu_{R_b}(u_i, u_j)$ and $\mu_{R_B}(u_i, u_j)$ are defined by Eqs. (3) and (4), respectively.

Definition 4.3 A fuzzy binary relation $\tilde{R}_b(u_i, u_j)$ between objects $u_i, u_j \in U$ is said to be a fuzzy tolerance relation if it is reflexive (i.e. $\tilde{R}_b(u_i, u_i) = 1,\ \forall u_i \in U$) and symmetric (i.e. $\tilde{R}_b(u_i, u_j) = \tilde{R}_b(u_j, u_i),\ \forall u_i, u_j \in U$).

Lemma 4.1 $T_b^{\alpha}$ is a tolerance relation.

Proof Reflexive: since $\mu_{R_b}(u_i, u_i) = \frac{2\,|b(u_i) \cap b(u_i)|}{|b(u_i)| + |b(u_i)|} = \frac{2\,|b(u_i)|}{2\,|b(u_i)|} = 1 \geq \alpha$, we have $(u_i, u_i) \in T_b^{\alpha}$. Symmetric: let $(u_i, u_j) \in T_b^{\alpha}$, i.e. $\mu_{R_b}(u_i, u_j) = \frac{2\,|b(u_i) \cap b(u_j)|}{|b(u_i)| + |b(u_j)|} \geq \alpha$. Now $\mu_{R_b}(u_j, u_i) = \frac{2\,|b(u_j) \cap b(u_i)|}{|b(u_j)| + |b(u_i)|} = \mu_{R_b}(u_i, u_j) \geq \alpha$, so $(u_j, u_i) \in T_b^{\alpha}$. Therefore, $T_b^{\alpha}$ is a fuzzy tolerance relation. $\square$

Example 4.3 If we take $\alpha = 0.3$ in Eq. (5) and apply it on Example 4.1, then we can see that only $(u_1, u_3)$ belongs to $T_b^{\alpha}$, i.e. only $u_1$ and $u_3$ are indiscernible with respect to attribute $b$. So, our proposed definition gives a more precise tolerance relation than the previous one.

Now, we define tolerance classes for an object $u_i$ with respect to $b \in B$ as follows:
$[T_b^{\alpha}](u_i) = \{u_j \in U \mid u_i\, T_b^{\alpha}\, u_j\},$   (7)
$[T_B^{\alpha}](u_i) = \{u_j \in U \mid u_i\, T_b^{\alpha}\, u_j,\ \forall b \in B\}.$   (8)

Then we propose lower and upper approximations of any object set $X \subseteq U$ as:
$T_B^{\alpha} \downarrow X = \{u_i \in U \mid [T_B^{\alpha}](u_i) \subseteq X\},$   (9)
$T_B^{\alpha} \uparrow X = \{u_i \in U \mid [T_B^{\alpha}](u_i) \cap X \neq \emptyset\}.$   (10)
The tuple $\langle T_B^{\alpha} \downarrow X,\ T_B^{\alpha} \uparrow X \rangle$ is called a tolerance rough set.
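A minimal sketch of Eqs. (7)-(10), reusing mu_b from the previous sketch and a data layout like the table2 dictionary shown earlier (helper names are ours):

def tolerance_class(data, i, B, alpha):
    """[T_B^alpha](u_i): objects at similarity >= alpha to u_i on every b in B."""
    return {j for j in data
            if all(mu_b(data[i][b], data[j][b]) >= alpha for b in B)}

def lower_approx(data, X, B, alpha):
    """Eq. (9): objects whose tolerance class is contained in X."""
    return {i for i in data if tolerance_class(data, i, B, alpha) <= set(X)}

def upper_approx(data, X, B, alpha):
    """Eq. (10): objects whose tolerance class intersects X."""
    return {i for i in data if tolerance_class(data, i, B, alpha) & set(X)}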

Properties of lower and upper approximations

In this subsection, we examine results on the lower and upper approximations, analogous to those of Dubois and Prade (1992), for our proposed approach. Let $(U, C \cup D, V, h)$ be a set-valued decision system, $B \subseteq C$, $X \subseteq U$ and $\alpha \in (0, 1)$.

Theorem 4.1 $T_B^{\alpha} \downarrow X \subseteq X \subseteq T_B^{\alpha} \uparrow X$.

Proof Since $x \in [T_B^{\alpha}](x)$, $x \in T_B^{\alpha} \downarrow X$ implies $[T_B^{\alpha}](x) \subseteq X$ and hence $x \in X$; therefore $T_B^{\alpha} \downarrow X \subseteq X$. Now let $x \in X$; since $x \in [T_B^{\alpha}](x)$, we have $[T_B^{\alpha}](x) \cap X \neq \emptyset$, so $x \in T_B^{\alpha} \uparrow X$; therefore $X \subseteq T_B^{\alpha} \uparrow X$. Hence $T_B^{\alpha} \downarrow X \subseteq X \subseteq T_B^{\alpha} \uparrow X$. $\square$

Theorem 4.2 Let $B_1 \subseteq B_2 \subseteq C$; then
(i) $T_{B_1}^{\alpha} \downarrow X \subseteq T_{B_2}^{\alpha} \downarrow X$,
(ii) $T_{B_2}^{\alpha} \uparrow X \subseteq T_{B_1}^{\alpha} \uparrow X$.

Proof Since $[T_B^{\alpha}](x) = \{y \in U \mid \mu_{R_b}(x, y) \geq \alpha,\ \forall b \in B\}$, $B_1 \subseteq B_2$ implies $[T_{B_2}^{\alpha}](x) \subseteq [T_{B_1}^{\alpha}](x)$ for every $x \in U$.
(i) Let $x \in T_{B_1}^{\alpha} \downarrow X$; then $[T_{B_1}^{\alpha}](x) \subseteq X$, hence $[T_{B_2}^{\alpha}](x) \subseteq X$, so $x \in T_{B_2}^{\alpha} \downarrow X$.
(ii) Let $x \in T_{B_2}^{\alpha} \uparrow X$; then $[T_{B_2}^{\alpha}](x) \cap X \neq \emptyset$, hence $[T_{B_1}^{\alpha}](x) \cap X \neq \emptyset$, so $x \in T_{B_1}^{\alpha} \uparrow X$. $\square$

Theorem 4.3 Let $\alpha_1 \leq \alpha_2$; then
(i) $T_B^{\alpha_1} \downarrow X \subseteq T_B^{\alpha_2} \downarrow X$,
(ii) $T_B^{\alpha_2} \uparrow X \subseteq T_B^{\alpha_1} \uparrow X$.

Proof Since $[T_B^{\alpha}](x) = \{y \in U \mid \mu_{R_b}(x, y) \geq \alpha,\ \forall b \in B\}$ and $\alpha_1 \leq \alpha_2$, we have $[T_B^{\alpha_2}](x) \subseteq [T_B^{\alpha_1}](x)$ for every $x \in U$.
(i) Let $x \in T_B^{\alpha_1} \downarrow X$; then $[T_B^{\alpha_1}](x) \subseteq X$, hence $[T_B^{\alpha_2}](x) \subseteq X$, so $x \in T_B^{\alpha_2} \downarrow X$.
(ii) Let $y \in T_B^{\alpha_2} \uparrow X$; then $[T_B^{\alpha_2}](y) \cap X \neq \emptyset$, hence $[T_B^{\alpha_1}](y) \cap X \neq \emptyset$, so $y \in T_B^{\alpha_1} \uparrow X$. $\square$

Theorem 4.4 $T_B^{\alpha} \downarrow (X^C) = (T_B^{\alpha} \uparrow X)^C$, where $X^C$ denotes the complement of the set $X$.

Proof $y \in T_B^{\alpha} \downarrow (X^C) \Leftrightarrow [T_B^{\alpha}](y) \subseteq X^C \Leftrightarrow [T_B^{\alpha}](y) \cap X = \emptyset \Leftrightarrow y \notin T_B^{\alpha} \uparrow X \Leftrightarrow y \in (T_B^{\alpha} \uparrow X)^C$. Hence, $T_B^{\alpha} \downarrow (X^C) = (T_B^{\alpha} \uparrow X)^C$. $\square$

Theorem 4.5 Let $Y \subseteq U$ be another set of objects; then the following properties hold:
(i) $T_B^{\alpha} \downarrow (X \cap Y) = T_B^{\alpha} \downarrow X \cap T_B^{\alpha} \downarrow Y$,
(ii) $T_B^{\alpha} \uparrow (X \cup Y) = T_B^{\alpha} \uparrow X \cup T_B^{\alpha} \uparrow Y$.

Proof (i) $z \in T_B^{\alpha} \downarrow (X \cap Y) \Leftrightarrow [T_B^{\alpha}](z) \subseteq X \cap Y \Leftrightarrow [T_B^{\alpha}](z) \subseteq X$ and $[T_B^{\alpha}](z) \subseteq Y \Leftrightarrow z \in T_B^{\alpha} \downarrow X$ and $z \in T_B^{\alpha} \downarrow Y \Leftrightarrow z \in T_B^{\alpha} \downarrow X \cap T_B^{\alpha} \downarrow Y$.
(ii) $z \in T_B^{\alpha} \uparrow (X \cup Y) \Leftrightarrow [T_B^{\alpha}](z) \cap (X \cup Y) \neq \emptyset \Leftrightarrow [T_B^{\alpha}](z) \cap X \neq \emptyset$ or $[T_B^{\alpha}](z) \cap Y \neq \emptyset \Leftrightarrow z \in T_B^{\alpha} \uparrow X$ or $z \in T_B^{\alpha} \uparrow Y \Leftrightarrow z \in T_B^{\alpha} \uparrow X \cup T_B^{\alpha} \uparrow Y$. $\square$

Theorem 4.6 $T_B^{\alpha} \downarrow U = U = T_B^{\alpha} \uparrow U$ and $T_B^{\alpha} \downarrow \emptyset = \emptyset = T_B^{\alpha} \uparrow \emptyset$.

Proof Easy to check. $\square$
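As a quick numerical illustration (ours, not part of the original paper), the duality in Theorem 4.4 can be checked on the Table 2 data with the helpers sketched earlier:

U = set(table2)
X = {'u1', 'u2', 'u4'}                   # an arbitrary subset of U
B, alpha = ['c1', 'c2'], 0.7

left = lower_approx(table2, U - X, B, alpha)    # lower approximation of the complement
right = U - upper_approx(table2, X, B, alpha)   # complement of the upper approximation of X
assert left == right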

Theorem 4.7 For every $x \in U$, $T_B^{\alpha} \downarrow [T_B^{\alpha}](x) \subseteq [T_B^{\alpha}](x) \subseteq T_B^{\alpha} \uparrow [T_B^{\alpha}](x)$; if, in addition, $T_B^{\alpha}$ is transitive, the three sets coincide.

Proof The two inclusions follow from Theorem 4.1 with $X = [T_B^{\alpha}](x)$. When $T_B^{\alpha}$ is transitive, $y \in [T_B^{\alpha}](x)$ implies $[T_B^{\alpha}](y) \subseteq [T_B^{\alpha}](x)$, so $y \in T_B^{\alpha} \downarrow [T_B^{\alpha}](x)$; and if $z \in T_B^{\alpha} \uparrow [T_B^{\alpha}](x)$, then there exists $y \in [T_B^{\alpha}](z) \cap [T_B^{\alpha}](x)$, whence $z \in [T_B^{\alpha}](x)$. $\square$

Theorem 4.8 $T_B^{\alpha} \uparrow \{x\} = [T_B^{\alpha}](x)$ for every $x \in U$.

Proof $z \in T_B^{\alpha} \uparrow \{x\} \Leftrightarrow [T_B^{\alpha}](z) \cap \{x\} \neq \emptyset \Leftrightarrow x \in [T_B^{\alpha}](z) \Leftrightarrow z \in [T_B^{\alpha}](x)$, by the symmetry of $T_B^{\alpha}$. $\square$

Theorem 4.9 If $T_B^{\alpha}$ is transitive (in particular, an equivalence relation), then
(i) $T_B^{\alpha} \downarrow (T_B^{\alpha} \downarrow X) = T_B^{\alpha} \downarrow X$,
(ii) $T_B^{\alpha} \uparrow (T_B^{\alpha} \uparrow X) = T_B^{\alpha} \uparrow X$.

Proof (i) $T_B^{\alpha} \downarrow (T_B^{\alpha} \downarrow X) \subseteq T_B^{\alpha} \downarrow X$ follows from Theorem 4.1. Conversely, let $y \in T_B^{\alpha} \downarrow X$ and $z \in [T_B^{\alpha}](y)$; for any $u \in [T_B^{\alpha}](z)$, transitivity gives $u \in [T_B^{\alpha}](y) \subseteq X$, so $z \in T_B^{\alpha} \downarrow X$; hence $[T_B^{\alpha}](y) \subseteq T_B^{\alpha} \downarrow X$ and $y \in T_B^{\alpha} \downarrow (T_B^{\alpha} \downarrow X)$.
(ii) $T_B^{\alpha} \uparrow X \subseteq T_B^{\alpha} \uparrow (T_B^{\alpha} \uparrow X)$ follows from Theorem 4.1. Conversely, let $y \in T_B^{\alpha} \uparrow (T_B^{\alpha} \uparrow X)$; then there is $z \in [T_B^{\alpha}](y)$ with $[T_B^{\alpha}](z) \cap X \neq \emptyset$, say $u \in [T_B^{\alpha}](z) \cap X$; transitivity gives $u \in [T_B^{\alpha}](y)$, so $[T_B^{\alpha}](y) \cap X \neq \emptyset$ and $y \in T_B^{\alpha} \uparrow X$. $\square$

Theorem 4.10
(i) $T_B^{\alpha} \downarrow X \subseteq T_B^{\alpha} \uparrow (T_B^{\alpha} \downarrow X) \subseteq T_B^{\alpha} \uparrow X$,
(ii) $T_B^{\alpha} \downarrow X \subseteq T_B^{\alpha} \downarrow (T_B^{\alpha} \uparrow X) \subseteq T_B^{\alpha} \uparrow X$.

Proof (i) Since $X \subseteq T_B^{\alpha} \uparrow X$ holds for any set (Theorem 4.1), replacing $X$ by $T_B^{\alpha} \downarrow X$ gives the first inclusion; the second follows from $T_B^{\alpha} \downarrow X \subseteq X$ together with the monotonicity of $\uparrow$ with respect to set inclusion.
(ii) From $T_B^{\alpha} \downarrow X \subseteq X \subseteq T_B^{\alpha} \uparrow X$ and the monotonicity of $\downarrow$ we obtain the first inclusion; replacing $X$ by $T_B^{\alpha} \uparrow X$ in $T_B^{\alpha} \downarrow X \subseteq X$ gives the second. $\square$

In the next section, we propose an attribute selection method for set-valued information systems based on the degree of dependency approach, using the above-defined lower approximation.

Degree of dependency-based attribute selection

Using the lower approximation defined above, the positive region of the set of decision attributes $D$ over a set of conditional attributes $B$ is defined as
$POS_B^{\alpha}(D) = \bigcup_{X \in U/D} (T_B^{\alpha} \downarrow X),$   (25)
where $U/D$ is the collection of classes of objects having the same decision values.

Now, using the definition of the positive region, we compute the degree of dependency of the decision attribute $D$ over the set of conditional attributes $B$ as
$\gamma_B^{\alpha}(D) = \frac{|POS_B^{\alpha}(D)|}{|U|},$   (26)
where $|\cdot|$ denotes the cardinality of a set and $\gamma_B^{\alpha}(D) \in [0, 1]$.

Theorem 5.1 Let $(U, C \cup D, V, h)$ be a set-valued decision system, $X \subseteq U$ and $\alpha \in (0, 1)$. If $B_1 \subseteq B_2 \subseteq C$, then $POS_{B_1}^{\alpha}(D) \subseteq POS_{B_2}^{\alpha}(D)$.

Proof If $B_1 \subseteq B_2$, we have $T_{B_1}^{\alpha} \downarrow X \subseteq T_{B_2}^{\alpha} \downarrow X$, as proved in Theorem 4.2(i), so that $\bigcup_{X \in U/D}(T_{B_1}^{\alpha} \downarrow X) \subseteq \bigcup_{X \in U/D}(T_{B_2}^{\alpha} \downarrow X)$. Therefore, $POS_{B_1}^{\alpha}(D) \subseteq POS_{B_2}^{\alpha}(D)$. $\square$

Theorem 5.2 Let $(U, C \cup D, V, h)$ be a set-valued decision system and $X \subseteq U$. If $\alpha_1 \leq \alpha_2$, then $POS_B^{\alpha_1}(D) \subseteq POS_B^{\alpha_2}(D)$.

Proof If $\alpha_1 \leq \alpha_2$, we have $T_B^{\alpha_1} \downarrow X \subseteq T_B^{\alpha_2} \downarrow X$, as proved in Theorem 4.3(i), so that $\bigcup_{X \in U/D}(T_B^{\alpha_1} \downarrow X) \subseteq \bigcup_{X \in U/D}(T_B^{\alpha_2} \downarrow X)$. Therefore, $POS_B^{\alpha_1}(D) \subseteq POS_B^{\alpha_2}(D)$. $\square$

Theorem 5.3 (Monotonicity of $\gamma_B(D)$) Suppose that $B \subseteq AT$, $\{c\}$ is an arbitrary conditional attribute belonging to the dataset, $D$ is the set of decision attributes and $\alpha \in (0, 1)$; then $\gamma_{B \cup \{c\}}^{\alpha}(D) \geq \gamma_B^{\alpha}(D)$.

Proof Since $T_{B \cup \{c\}}^{\alpha} \subseteq T_B^{\alpha}$, we have $[T_{B \cup \{c\}}^{\alpha}](u_i) \subseteq [T_B^{\alpha}](u_i)$ for every $u_i \in U$, which implies $T_B^{\alpha} \downarrow X \subseteq T_{B \cup \{c\}}^{\alpha} \downarrow X$ for every $X \subseteq U$. Hence $POS_B^{\alpha}(D) \subseteq POS_{B \cup \{c\}}^{\alpha}(D)$ and, since $\gamma_B^{\alpha}(D) = |POS_B^{\alpha}(D)|/|U|$, it follows that $\gamma_{B \cup \{c\}}^{\alpha}(D) \geq \gamma_B^{\alpha}(D)$. $\square$

A subset $B$ of the conditional attribute set $C$ is said to be a reduct of the set-valued decision system (SVDS) if
$\gamma_B^{\alpha}(D) = \gamma_C^{\alpha}(D)$ and $\gamma_{B - \{b_i\}}^{\alpha}(D) < \gamma_B^{\alpha}(D),\ \forall b_i \in B.$   (27)

The selection of attributes in the reduct set is achieved by comparing the degrees of dependency of the decision attribute over sets of conditional attributes. Attributes are selected one by one until the reduct set provides the same quality of classification as the original set.
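A minimal sketch of Eqs. (25) and (26), reusing the helpers from the earlier sketches (names are ours):

def decision_classes(data, d='d'):
    """U/D: group objects by their decision value."""
    classes = {}
    for name, obj in data.items():
        classes.setdefault(obj[d], set()).add(name)
    return list(classes.values())

def positive_region(data, B, alpha, d='d'):
    """Eq. (25): union of the lower approximations of the decision classes."""
    pos = set()
    for X in decision_classes(data, d):
        pos |= lower_approx(data, X, B, alpha)
    return pos

def gamma(data, B, alpha, d='d'):
    """Eq. (26): |POS_B(D)| / |U|."""
    return len(positive_region(data, B, alpha, d)) / float(len(data))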

An algorithm for tolerance rough set-based attribute selection of set-valued data

In this section, a quick reduct algorithm for attribute selection in set-valued information systems is presented by using the degree of dependency method based on the tolerance relation. Initially, the proposed algorithm starts with an empty set and adds attributes one by one to calculate the degree of dependency of the decision attribute over a set of conditional attributes. It selects those conditional attributes which provide the maximum increase in the degree of dependency of the decision attribute. The proposed algorithm is given as follows.
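A sketch of this greedy procedure is shown below; it is our reading of the description above rather than the paper's exact listing, and it reuses gamma() from the previous sketch:

def quick_reduct(data, conditional, alpha, d='d'):
    """Greedy quick reduct: grow the subset until it reaches the dependency
    of the full conditional attribute set (or no attribute improves it)."""
    full = gamma(data, conditional, alpha, d)
    reduct, current = [], 0.0
    while current < full:
        best_attr, best_gain = None, current
        for a in conditional:
            if a in reduct:
                continue
            g = gamma(data, reduct + [a], alpha, d)
            if g > best_gain:
                best_attr, best_gain = a, g
        if best_attr is None:      # no attribute increases the dependency
            break
        reduct.append(best_attr)
        current = best_gain
    return reduct

# e.g. quick_reduct(table2, ['c1', 'c2', 'c3', 'c4'], alpha=0.7)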

The main advantage of the proposed algorithm is that it produces a close-to-minimal reduct set of a decision system without exhaustively checking all possible subsets of conditional attributes.

Now, we apply the above proposed algorithm on some example datasets to demonstrate our approach.

Illustrative examples

Example 7.1 Consider the set-valued decision system given in Table 2. The fuzzy tolerance relation between objects $u_i, u_j \in U$, calculated by using Eq. (3), is given in Table 5.

Table 5 Fuzzy tolerance relation

$\mu_{R_{c1}}(u_i, u_j)$:
      u1     u2     u3     u4     u5
u1    1      0.67   1      0.86   0.67
u2    0.67   1      0.67   0.8    0.5
u3    1      0.67   1      0.86   0.67
u4    0.86   0.8    0.86   1      0.8
u5    0.67   0.5    0.67   0.8    1

$\mu_{R_{c2}}(u_i, u_j)$:
      u1     u2     u3     u4     u5
u1    1      0      0.5    0.67   0.8
u2    0      1      0.5    0.67   0.4
u3    0.5    0.5    1      0.67   0.8
u4    0.67   0.67   0.67   1      0.86
u5    0.8    0.4    0.8    0.86   1

$\mu_{R_{c3}}(u_i, u_j)$:
      u1     u2     u3     u4     u5
u1    1      0.67   1      0.5    0.5
u2    0.67   1      0.67   0.67   0.67
u3    1      0.67   1      0.5    0.5
u4    0.5    0.67   0.5    1      1
u5    0.5    0.67   0.5    1      1

$\mu_{R_{c4}}(u_i, u_j)$:
      u1     u2     u3     u4     u5
u1    1      0.9    0.5    0.8    0.4
u2    0.9    1      0.6    0.7    0.5
u3    0.5    0.6    1      0.3    0.9
u4    0.8    0.7    0.3    1      0.2
u5    0.4    0.5    0.9    0.2    1
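For instance, the $\mu_{R_{c1}}$ block of Table 5 can be reproduced with the earlier sketches as follows (an illustration of ours):

names = sorted(table2)   # ['u1', 'u2', 'u3', 'u4', 'u5']
for i in names:
    print(i, [round(mu_b(table2[i]['c1'], table2[j]['c1']), 2) for j in names])
# the first printed row is [1.0, 0.67, 1.0, 0.86, 0.67]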

We calculate the degree of dependency of the decision attribute $d$ over the conditional attribute $c_1$ as follows, taking $\alpha = 0.70$.

Tolerance classes:
$[T_{c1}^{\alpha}](u_1) = \{u_1, u_3, u_4\}$, $[T_{c1}^{\alpha}](u_2) = \{u_2\}$, $[T_{c1}^{\alpha}](u_3) = \{u_1, u_3, u_4\}$, $[T_{c1}^{\alpha}](u_4) = \{u_1, u_3, u_4\}$, $[T_{c1}^{\alpha}](u_5) = \{u_5\}$.

$U/d = \{d_1, d_2\}$ with $d_1 = \{u_1, u_2, u_4\}$ and $d_2 = \{u_3, u_5\}$.

The lower approximations of $U/d$ are calculated as
$T_{c1}^{\alpha} \downarrow d_1 = \{u_2\}$, $T_{c1}^{\alpha} \downarrow d_2 = \{u_5\}$.

So, the positive region of $d$ over $c_1$ is calculated as
$POS_{c1}^{\alpha}(d) = \bigcup_{X \in U/d}(T_{c1}^{\alpha} \downarrow X) = (T_{c1}^{\alpha} \downarrow d_1) \cup (T_{c1}^{\alpha} \downarrow d_2) = \{u_2\} \cup \{u_5\} = \{u_2, u_5\}.$

Now, the degree of dependency of $d$ over $c_1$ is calculated as
$\gamma_{\{c_1\}}(d) = \frac{|POS_{c1}^{\alpha}(d)|}{|U|} = \frac{2}{5} = 0.4.$

Similarly, we can calculate the degree of dependency of the decision attribute with respect to the other conditional attributes:
$\gamma_{\{c_2\}}(d) = \frac{3}{5} = 0.6$, $\gamma_{\{c_3\}}(d) = \frac{1}{5} = 0.2$, $\gamma_{\{c_4\}}(d) = \frac{3}{5} = 0.6$.

Since $\gamma_{\{c_3\}}(d) < \gamma_{\{c_1\}}(d) < \gamma_{\{c_2\}}(d) = \gamma_{\{c_4\}}(d)$, either $c_2$ or $c_4$ will be a member of the reduct set.

Suppose $c_2$ is the first reduct member. We add the other attributes to $c_2$ one by one and calculate the corresponding degrees of dependency by using Eq. (8); we get
$[T_{\{c_1,c_2\}}^{\alpha}](u_1) = \{u_1\}$, $[T_{\{c_1,c_2\}}^{\alpha}](u_2) = \{u_2\}$, $[T_{\{c_1,c_2\}}^{\alpha}](u_4) = \{u_3, u_4\}$, $[T_{\{c_1,c_2\}}^{\alpha}](u_5) = \{u_5\}$,
$T_{\{c_1,c_2\}}^{\alpha} \downarrow d_1 = \{u_1, u_2\}$, $T_{\{c_1,c_2\}}^{\alpha} \downarrow d_2 = \{u_5\}$,
$POS_{\{c_1,c_2\}}^{\alpha}(d) = (T_{\{c_1,c_2\}}^{\alpha} \downarrow d_1) \cup (T_{\{c_1,c_2\}}^{\alpha} \downarrow d_2) = \{u_1, u_2, u_5\}$,
$\gamma_{\{c_1,c_2\}}(d) = \frac{3}{5} = 0.6.$
Similarly, $\gamma_{\{c_2,c_3\}}(d) = \frac{3}{5} = 0.6$ and $\gamma_{\{c_2,c_4\}}(d) = \frac{5}{5} = 1$.

Since the degree of dependency cannot exceed 1, $\{c_2, c_4\}$ will be the reduct set of the set-valued decision system given in Table 2.

Applying the method of Dai et al. (2013) on this example set-valued dataset, we get the same reduct set $\{c_2, c_4\}$; but when we change the value of the parameter $\alpha$ from 0.7 to 0.9, our approach gives a singleton reduct set. Therefore, the proposed approach gives the facility to obtain the best minimal reduct for a set-valued decision system.

Example 7.2 After converting the incomplete decision system given in Table 3 into a set-valued decision system by replacing missing attribute values with the set of all possible attribute values for any object, we get Table 4. Again, similar to Example 7.1, the fuzzy tolerance relation between objects $u_i, u_j \in U$ is given in Table 6.

Taking $\alpha = 0.4$, we have $d_1 = \{u_1, u_2, u_5, u_6\}$ and $d_2 = \{u_3, u_4\}$.

Calculating the degree of dependency of the decision attribute $D$ over the conditional attribute $c_1$:
$[T_{c1}^{\alpha}](u_1) = \{u_1, u_3, u_4, u_5\}$, $[T_{c1}^{\alpha}](u_2) = \{u_2, u_3, u_5, u_6\}$, $[T_{c1}^{\alpha}](u_3) = \{u_1, u_2, u_3, u_4, u_5, u_6\}$, $[T_{c1}^{\alpha}](u_4) = \{u_1, u_3, u_4, u_5\}$, $[T_{c1}^{\alpha}](u_5) = \{u_1, u_2, u_3, u_4, u_5, u_6\}$, $[T_{c1}^{\alpha}](u_6) = \{u_2, u_3, u_5, u_6\}$.

Now, $T_{c1}^{\alpha} \downarrow d_1 = \emptyset$ and $T_{c1}^{\alpha} \downarrow d_2 = \emptyset$, so $POS_{c1}^{\alpha}(d) = \emptyset$ and $\gamma_{\{c_1\}}(d) = 0$.

Similarly, for the other conditional attributes,
$\gamma_{\{c_2\}}(d) = 0$, $\gamma_{\{c_3\}}(d) = 0$, $\gamma_{\{c_4\}}(d) = 0.67$.

Since the degree of dependency is the highest for $c_4$, $c_4$ will be the first member of the reduct set. Similar to Example 7.1, on adding the other attributes to $c_4$, we can calculate the corresponding degrees of dependency as follows:
$\gamma_{\{c_1,c_4\}}(d) = 1$, $\gamma_{\{c_2,c_4\}}(d) = 1$, $\gamma_{\{c_3,c_4\}}(d) = 0.67$.

Table 6 Fuzzy tolerance relation

$\mu_{R_{c1}}(u_i, u_j)$:
      u1     u2     u3     u4     u5     u6
u1    1      0      0.67   1      0.67   0
u2    0      1      0.67   0      0.67   1
u3    0.67   0.67   1      0.67   1      0.67
u4    1      0      0.67   1      0.67   0
u5    0.67   0.67   1      0.67   1      0.67
u6    0      1      0.67   0      0.67   1

$\mu_{R_{c2}}(u_i, u_j)$:
      u1     u2     u3     u4     u5     u6
u1    1      0      0.5    0      0.5    1
u2    0      1      0.5    0      0.5    0
u3    0.5    0.5    1      0.5    1      0.5
u4    0      0      0.5    1      0.5    0
u5    0.5    0.5    1      0.5    1      0.5
u6    1      0      0.5    0      0.5    1

$\mu_{R_{c3}}(u_i, u_j)$:
      u1     u2     u3     u4     u5     u6
u1    1      0.4    0.4    0      0      0
u2    0.4    1      1      0.4    0.4    0.4
u3    0.4    1      1      0.4    0.4    0.4
u4    0      0.4    0.4    1      0      0
u5    0      0.4    0.4    0      1      0
u6    0      0.4    0.4    0      0      1

$\mu_{R_{c4}}(u_i, u_j)$:
      u1     u2     u3     u4     u5     u6
u1    1      0      0      0      0      1
u2    0      1      0      1      0      0
u3    0      0      1      0      0      0
u4    0      1      0      1      0      0
u5    0      0      0      0      1      0
u6    1      0      0      0      0      1

Hence, $\{c_1, c_4\}$ or $\{c_2, c_4\}$ will be the reduct set of the incomplete decision system given in Table 3.

Here, $\alpha$ is user-oriented, so we can find the best minimal reduct by changing the value of $\alpha$, as shown in Table 7.

Table 7 Effect of $\alpha$ on the reduct set

Value of $\alpha$     Reducts
$\alpha = 0.4$        $\{c_1, c_4\}$ or $\{c_2, c_4\}$
$\alpha = 0.5$        $\{c_3, c_4\}$
$\alpha = 0.6$        $\{c_2, c_4\}$ or $\{c_3, c_4\}$
$\alpha = 0.7$        $\{c_2, c_3\}$ or $\{c_2, c_4\}$

So, an expert can decide the value of $\alpha$ according to the domain in order to find the most suitable reduct set of a decision system with missing values.
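One way to explore this effect of the threshold programmatically, in the spirit of Table 7, is to sweep $\alpha$ and rerun the greedy procedure; the snippet below is our illustration, applied to the Table 2 data with the helpers sketched earlier:

for alpha in (0.4, 0.5, 0.6, 0.7):
    print(alpha, quick_reduct(table2, ['c1', 'c2', 'c3', 'c4'], alpha))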

Example 7.3 Let us consider a practical situation from the foreign language ability test in Shanxi University, China. The results can be viewed as a conjunctive set-valued information system. We have classified the whole test into four factors: Audition, Spoken language, Reading and Writing. The test results are given in Table 8, which can be downloaded from http://www.yuhuaqian.com, where $U = \{u_1, u_2, u_3, \ldots, u_{49}, u_{50}\}$. For convenience, we use the abbreviations A, S, R and W for Audition, Spoken language, Reading and Writing, respectively.

We apply the same process as described in Example 7.1 and calculate the reduct for this dataset by taking $\alpha = 0.67$: the degrees of dependency of the decision attribute over the single conditional attributes (for instance, $\gamma_{\{c_1\}}(D) = 0$) and then over their combinations are computed as before. So, the reduct of the decision system is either $\{c_2, c_3, c_4\}$ or $\{c_1, c_2, c_4\}$.

So far, we have performed the experimental analysis for attribute selection of a set-valued information system by applying the proposed rough set-based approach in Example 7.1. We have compared the proposed approach with an existing approach to find the close-to-minimal reduct set by changing the value of the parameter $\alpha$. In Example 7.2, we have dealt with the problem of attribute selection in an incomplete information system through conversion into a set-valued information system by replacing the missing attribute values of an object with the set of all possible attribute values. The effect of the parameter $\alpha$ on finding a minimal reduct set, which depends on the user's choice, has also been shown. In Example 7.3, we have successfully applied our approach in a practical situation obtained from the foreign language ability test in Shanxi University, China.

Experimental results and analysis

To check the efficiency of the proposed attribute reduction algorithm for set-valued information systems, we perform some experiments on a PC with the specifications given in Table 9. We conduct our experiments on six real datasets taken from the University of California, Irvine (UCI) Machine Learning Repository (Blake 1998). All six real datasets are incomplete decision systems (a special case of set-valued decision systems), given in Table 10. For the experimental work, we use the WEKA tool with tenfold cross-validation (Hall et al. 2009). In the experiments, we select three attribute reduction algorithms for comparison. There are two statistical approaches, Relief-F (Robnik-Šikonja and Kononenko 2003) and correlation-based feature selection (Hall 1999), and one fuzzy rough set model (Dai and Tian 2013). For convenience, we denote them as Relief-F, CFS and FRSM, respectively. For the calculation of classification accuracies, we use two classifiers, namely PART and J48. Finally, a paired t test is performed to ensure the significance of the experimental

Table 8 A set-valued decision table obtained from foreign language ability test in Shanxi University
8山西大学外语能力测验得到的定值决策表

Table 9 The description of the experiment environment

No.   Name        Model                        Parameters
1     CPU         Intel(R) Core(TM) i5-4210U   1.70 GHz, 2 cores
2     Memory      DDR3 SDRAM                   8 GB, 2401 MHz
3     Hard disk   ST1000LM024                  1 TB
4     System      Windows 10                   64-bit
5     Platform    Python 2.7                   Anaconda distribution for Windows

Table 10 The description of datasets

No.   Dataset                   Abbreviation   Objects   Features   Classes
1     Audiology_Standardized    Audiology      226       69         24
2     Soyabean_Large            Soyabean       307       35         19
3     Dermatology               Dermatology    366       34         6
4     Hepatitis                 Hepatitis      115       19         2
5     Zoo                       Zoo            101       17         7
6     Processed_Cleveland       Cleveland      303       14         5

results obtained from the proposed approach, where the significance level is specified to be 0.05.
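For reference, a generic paired t test of this kind can be run with SciPy; the sketch below is our illustration and does not reproduce the paper's exact pairing of samples (e.g. over cross-validation folds). The two lists take the FSRS ($\alpha$ = 0.3) and FRSM columns of Table 12 as example inputs.

from scipy import stats

fsrs = [78.31, 0.0, 91.26, 83.22, 95.04, 54.12]    # FSRS (alpha = 0.3), Table 12
frsm = [76.12, 87.70, 90.25, 76.13, 85.14, 49.17]  # FRSM, Table 12
t_stat, p_value = stats.ttest_rel(fsrs, frsm)
print(p_value < 0.05)   # True would indicate significance at the 0.05 level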

Size and the classification accuracy of the reduced feature subset obtained by FSRS are statistically compared with those acquired by FRSM, Relief-F and CFS by using the paired t test. The obtained results are listed in the last columns of Tables 11, 12 and 13. In the tables, "w" denotes the number of wins, "*" denotes the number of ties and "l" denotes the number of losses achieved by the proposed FSRS approach, which is also written next to the values in Tables 11, 12 and 13. Here, a win means that the cardinality (or accuracy) of the feature subset obtained by FSRS is

significantly fewer (or higher) than that of FRSM, Relief-F or CFS; a tie means that the results obtained by FSRS have no statistical difference from those of FRSM, Relief-F or CFS; and a loss means that the proposed approach is statistically poorer than the other approaches.

Reduct size

After comparing the proposed approach with the other three approaches on the chosen datasets, the reduced average (avg.) feature subset sizes are given in Table 11. The effect of parameter $\alpha$ on the reduct size is also shown for the proposed approach.

Table 11 Comparison of feature subset size (Avg. subset size)

Table 12 Comparison of classification accuracies (rules-PART)

Datasets      Original   FSRS                                 FRSM      Relief-F   CFS       Paired t test (w/*/l)
                         a=0.1    a=0.3    a=0.5    a=0.7
Audiology     78.31      77.87    78.31    75.66    75.66     76.12*    80.08*     77.43*    (0/3/0)
Soyabean      91.94      0        0        83.74    88.43     87.70*    87.99 l    85.06*    (0/2/1)
Dermatology   94.53      91.26    91.26    91.26    91.26     90.25 w   91.53 w    90.71 w   (3/0/0)
Hepatitis     67.74      83.22    83.22    83.22    83.22     76.13 w   84.51*     81.93*    (1/2/0)
Zoo           92.07      95.04    95.04    95.04    95.04     85.14 w   95.04*     95.04*    (1/2/0)
Cleveland     53.79      54.12    54.12    54.12    54.12     49.17 w   57.75 l    54.78*    (1/1/1)

Table 13 Comparison of classification accuracies (trees-J48)

Datasets      Original   FSRS                                 FRSM      Relief-F   CFS       Paired t test (w/*/l)
                         a=0.1    a=0.3    a=0.5    a=0.7
Audiology     77.87      78.31    78.76    77.87    77.87     76.12*    78.76*     77.87*    (0/3/0)
Soyabean      91.50      0        0        86.23    87.99     87.99*    87.84 l    85.36*    (0/2/1)
Dermatology   93.98      92.62    92.62    92.62    92.62     88.28*    90.44*     87.70*    (0/3/0)
Hepatitis     58.06      83.22    83.22    83.22    83.22     78.06 w   84.51*     81.29*    (1/2/0)
Zoo           92.07      94.05    94.05    94.05    94.05     89.10 w   95.04*     95.04*    (1/2/0)
Cleveland     52.14      54.12    54.12    54.12    54.12     54.12*    56.10*     54.12*    (0/3/0)

Results obtained from Table 11 indicate that all four feature selection algorithms exclude most of the features available in the unreduced datasets. However, it can be observed that the proposed approach provides a smaller or equal reduct size as compared to the other three approaches. For the hepatitis dataset, which has 19 attributes, the proposed approach (FSRS) selects 3 attributes (the nearest integer is taken), while FRSM, Relief-F and CFS select 6, 5 and 10 attributes, respectively. This shows that FSRS has a redundancy-removing capacity, whereas the other algorithms do not completely eradicate redundant features from the selected feature subset.

Effect of parameter a The parameter a needs to be set individually for each dataset because of their different correlation strengths. In Table 11, the selected feature subset for the soyabean dataset varies with the change in parameter a. At a = 0.1 and a = 0.3, FSRS does not provide

any reduct elements, but as we increase the threshold parameter a, FSRS outperforms the other approaches. For example, at a = 0.5 and a = 0.7, FSRS selects 9 and 12 attributes, respectively, while FRSM, Relief-F and CFS select 16, 17 and 17 attributes, respectively. Also, for the audiology data, the subset size is affected by the parameter value a: at a = 0.1, a = 0.3 and a = 0.5, FSRS gives reduct sizes 12, 12 and 10, respectively, but at a = 0.7, the reduct size is the same as at a = 0.5. The remaining four datasets show no variation with the value of parameter a, as they provide the same number of selected attributes for all chosen values of a.
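To make the role of the threshold concrete, the sketch below computes a greedy, degree-of-dependency-based reduct for a toy set-valued table and sweeps the parameter a. It is a minimal sketch, not the authors' exact formulation: the Jaccard-style set similarity, the crisp thresholding into tolerance classes (in place of the paper's fuzzy tolerance relation) and the toy data are assumptions for illustration only.

def similarity(v1, v2):
    """Assumed set similarity between two set-valued entries (Jaccard index)."""
    v1, v2 = set(v1), set(v2)
    return len(v1 & v2) / len(v1 | v2) if v1 | v2 else 1.0

def tolerance_classes(objects, attrs, alpha):
    """Objects i and j are tolerant if every attribute similarity is >= alpha."""
    n = len(objects)
    return [{j for j in range(n)
             if all(similarity(objects[i][a], objects[j][a]) >= alpha for a in attrs)}
            for i in range(n)]

def dependency(objects, attrs, decision, alpha):
    """Degree of dependency: fraction of objects whose tolerance class
    lies entirely inside one decision class (the positive region)."""
    if not attrs:
        return 0.0
    classes = tolerance_classes(objects, attrs, alpha)
    pos = sum(1 for cls in classes if len({decision[j] for j in cls}) == 1)
    return pos / len(objects)

def greedy_reduct(objects, all_attrs, decision, alpha):
    """Forward greedy search: repeatedly add the attribute that gives the
    largest gain in dependency until the full-attribute dependency is reached."""
    target = dependency(objects, all_attrs, decision, alpha)
    reduct, gamma = [], 0.0
    while gamma < target:
        best_attr, best_gamma = None, gamma
        for a in all_attrs:
            if a in reduct:
                continue
            g = dependency(objects, reduct + [a], decision, alpha)
            if g > best_gamma:
                best_attr, best_gamma = a, g
        if best_attr is None:          # no attribute improves the dependency further
            break
        reduct.append(best_attr)
        gamma = best_gamma
    return reduct

# Toy set-valued decision table (hypothetical) and a sweep over the threshold a
objects = [{'c1': {1, 2}, 'c2': {0}},
           {'c1': {2},    'c2': {0, 1}},
           {'c1': {3},    'c2': {1}}]
decision = [0, 0, 1]
for alpha in (0.1, 0.3, 0.5, 0.7):
    print(alpha, greedy_reduct(objects, ['c1', 'c2'], decision, alpha))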

Statistical analysis It can be seen clearly from the results of the t test, which is applied between the reduct sizes of the FSRS (at a = 0.5), FRSM, Relief-F and CFS algorithms (presented in Table 11), that, for almost all the datasets, FSRS outperforms the other three reduction algorithms in terms of

Fig. 1 Variation of reduced feature subset sizes with four algorithms

Fig. 2 Variation of classification accuracies with classifier PART for four algorithms

the cardinality of the feature subset. In Table 11, FSRS achieves significantly fewer features for all the datasets except the Cleveland and zoo datasets. For the Cleveland dataset, FSRS is statistically equivalent to the Relief-F and CFS algorithms in terms of subset size. In summary, out of a total of 18 paired t test performance results, it obtains 15 wins, 3 ties and 0 losses.

Classification accuracy

A comparison of classification accuracies for the classifiers PART and J48 is presented in Tables 12 and 13, respectively. The classification accuracies are given in percentage. In Table 12, for the soyabean, dermatology and zoo datasets, the classification accuracies obtained by the FSRS algorithm are higher than or equal to those of the other three algorithms, while for the other datasets it shows mixed behaviour. For example, the classification accuracy for the Cleveland

dataset is 54.12 for the FSRS approach and 49.17, 57.75 and 54.78 for the FRSM, Relief-F and CFS approaches, respectively. Similarly, for the hepatitis dataset, the proposed algorithm provides better classification accuracy than FRSM and CFS, but less than the Relief-F algorithm. In Table 13, we find similar results to those in Table 12.
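The accuracies above are obtained with WEKA classifiers. As a rough Python sketch of the same evaluation protocol, the snippet below computes the cross-validated accuracy of a feature subset; scikit-learn's DecisionTreeClassifier is used here only as a stand-in for J48, and the synthetic data stand in for the benchmark datasets, so none of this is the authors' original setup.

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

def subset_accuracy(X, y, selected_columns, folds=10):
    """Mean cross-validated accuracy (in %) on the selected feature subset."""
    clf = DecisionTreeClassifier(random_state=0)
    scores = cross_val_score(clf, X[:, selected_columns], y, cv=folds)
    return 100.0 * scores.mean()

# Synthetic stand-in data; with a real dataset the column indices would be
# the attributes returned by the FSRS reduct.
rng = np.random.RandomState(0)
X = rng.randint(0, 3, size=(120, 10))
y = (X[:, 0] + X[:, 3] > 2).astype(int)
print(subset_accuracy(X, y, list(range(10))))   # all attributes
print(subset_accuracy(X, y, [0, 3]))            # a reduced subset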

Effect of parameter a As we change the value of a, the corresponding classification accuracies also change for the soyabean and audiology datasets, but not for the other datasets. In Table 12, for a = 0.1 and a = 0.3, the classification accuracies for the soyabean dataset are 0, but for a = 0.5 and a = 0.7, the classification accuracies are 83.74 and 88.43, respectively. So, by changing the value of parameter a, we can obtain better classification accuracies than the other three approaches. The parameter has no effect on the remaining four datasets.

Statistical analysis A paired t test is applied between the classification accuracies of the FSRS (at a = 0.5), FRSM, Relief-F and CFS approaches, and the results are shown in the last column

Fig. 3 Variation of classification accuracies with classifier J48 for four algorithms

of Tables 12 and 13. In Table 12, FSRS achieves significantly higher or equivalent accuracy for all datasets except the soyabean and Cleveland datasets, where it loses to the Relief-F approach. In summary, out of a total of 18 paired t test performance results, the FSRS approach obtains 6 wins, 10 ties and 2 losses. Similarly, in Table 13, out of a total of 18 paired t test performance results, it obtains 2 wins, 15 ties and 1 loss. Therefore, the proposed FSRS approach is effective and clearly better than the other approaches in terms of both selecting fewer features and achieving high classification accuracy.

More detailed change trends of each approach on the six datasets are displayed in Figs. 1, 2 and 3. Figure 1 presents a comparison of the reduced average feature subset sizes for all four algorithms on the six datasets. It can be observed that the proposed FSRS algorithm selects the least number of features as members of the reduct set. Figures 2 and 3 display the more detailed change trend of the algorithms in classification accuracy with the number of selected attributes on all chosen datasets. It is obvious from the figures that the proposed approach provides either higher or nearly equal classification accuracy for all six datasets.

After summarizing the comparison tables and graphs above, we can finally conclude that the proposed FSRS algorithm is an acceptable choice to select the best feature subsets in set-valued decision systems.

Conclusion and future work

In this paper, we have defined a tolerance relation for set-valued decision systems and given a novel approach for attribute selection based on the rough set concept using a similarity threshold. Lower and upper approximations have been defined by using the fuzzy tolerance relation, and a method has been presented to calculate the degree of dependency of the decision attribute over a subset of conditional attributes. Some important results on lower and upper approximations, positive regions and degrees of dependency have been validated using our approach. Moreover, we have presented an algorithm along with some illustrative examples for a better understanding of the proposed approach. In Example 7.2, we have applied our method to an incomplete information system in which some attribute values were missing. The effect of parameter a on the reduct set of set-valued decision systems has been shown. We have compared the proposed approach with three existing approaches on six real benchmark datasets and observed that our model is able to find the minimal reduct with higher accuracy. We have also ensured, by using the paired t test technique, that the proposed approach is statistically more significant in comparison with the other approaches. In the future, we will investigate some robust models for set-valued information systems to avoid misclassification and noise. Set-valued information systems with missing decision values will be taken into consideration from the viewpoint of updating the process of knowledge discovery. We also intend to find some generalizations of fuzzy rough set-based attribute selection for set-valued decision systems.

Compliance with ethical standards

Conflict of interest The authors declare that they have no conflict of interest.

Research involving human participants and/or animals This article does not contain any studies with human participants or animals performed by any of the authors.

References

Blake CL (1998) UCI Repository of machine learning databases, Irvine, University of California. http://www.ics.uci.edu/
BlakeCL(1998)UCI机器学习数据库存储库,尔湾,加利福尼亚大学http://www.ics.uci.edu/

*mlearn/MLRepository.html. Accessed 1 Feb 2019
*mlearn/MLRepository.htm访问日期2019 年 2 月1

Dai J (2013) Rough set approach to incomplete numerical data. Inf
DaiJ(2013)不完全数值数据的粗糙方法.Inf

Sci 241:43–57
科学241:43-57

Dai J, Tian H (2013) Fuzzy rough set model for set-valued data.
DaiJ,Tian H(2013)用于集值数据的模糊粗糙模型.

Fuzzy Sets Syst 229:54–68
模糊系统229:54-68

Dai J, Xu Q (2012) Approximations and uncertainty measures in incomplete information systems. Inf Sci 198:62–80
Dai J, Xu Q (2012) 不完整信息系统中的近似值和不确定性测量 。国际科学 198:62-80

Dai J, Wang W, Tian H, Liu L (2013) Attribute selection based on a new conditional entropy for incomplete decision systems. Knowl-Based Syst 39:207–213
Dai J, Wang W, Tian H, Liu L (2013) 基于不完整决策系统的新条件熵的属性选择基于知识的系统 39:207-213

Dubois D, Prade H (1992) Putting rough sets and fuzzy sets together. In: Słowin´ski R (ed) Intelligent decision support. Springer, Dordrecht, pp 203–232
Dubois D, Prade H (1992) 将粗糙集和模糊集放在一起。 In: Słowinski R (ed) Intelligent decision support. 施普林格,多德雷赫特,第 203-232 页

Guan YY, Wang HK (2006) Set-valued information systems. Inf Sci 176(17):2507–2525
Guan YY, Wang HK (2006) 集值信息系统。国际科学 176(17):2507–2525

Hall M (1999) Correlation-based feature selection for machine learning. PhD Thesis, Department of Computer Science, Waikato University, New Zealand
Hall M (1999) 用于机器学习的基于相关性的特征选择 博士论文,新西兰怀卡托大学计算机科学系

Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. ACM SIGKDD Explor Newsl 11(1):10–18
Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) WEKA 数据挖掘软件:更新。ACM SIGKDD 探索新闻 11(1):10-18

He Y, Naughton JF (2009) Anonymization of set-valued data via top- down, local generalization. Proc VLDB Endow 2(1):934–945 

Hu Q, Yu D, Liu J, Wu C (2008) Neighborhood rough set based heterogeneous feature subset selection. Inf Sci 178(18):3577–3594 

Huang SY (ed) (1992) Intelligent decision support: handbook of applications and advances of the rough sets theory, vol 11. Springer, Berlin 

Jensen R, Shen Q (2009) New approaches to fuzzy-rough feature selection. IEEE Trans Fuzzy Syst 17(4):824–838 

Jensen R, Cornelis C, Shen Q. (2009) Hybrid fuzzy-rough rule induction and feature selection. In: FUZZ-IEEE 2009, IEEE international conference on fuzzy systems, 2009. IEEE, 

pp. 1151–1156 

Kryszkiewicz M (1998) Rough set approach to incomplete informa- tion systems. Inf Sci 112(1–4):39–49 

Kryszkiewicz M (1999) Rules in incomplete information systems. Inf Sci 113(3–4):271–292 

Lang G, Li Q, Yang T (2014) An incremental approach to attribute reduction of dynamic set-valued information systems. Int J Mach Learn Cybern 5(5):775–788
Lang G, Li Q, Yang T (2014) 动态集值信息系统属性缩减的增量方法。国际赫学习网络杂志 5(5):775–788

Leung Y, Li D (2003) Maximal consistent block technique for rule acquisition in incomplete information systems. Inf Sci 153:85–106
Leung Y, Li D (2003) 用于不完整信息系统中规则获取的最大一致块技术。国际科学 153:85-106

Lipski W Jr (1979) On semantic issues connected with incomplete information databases. ACM Trans Database Syst (TODS) 4(3):262–296
Lipski W Jr (1979) 关于与不完整信息数据库相关的语义问题。ACM Trans 数据库系统 (TODS) 4(3):262–296

Lipski W Jr (1981) On databases with incomplete information.
LipskiWJr(1981)关于信息不完整数据库

J ACM (JACM) 28(1):41–70
美国医学杂志(JACM)28(1):41-70

Luo C, Li T, Chen H, Liu D (2013) Incremental approaches for updating approximations in set-valued ordered information systems. Knowl-Based Syst 50:218–233
Luo C, Li T, Chen H, Liu D (2013) 更新集合值有序信息系统中近似值的增量方法。基于知识的系统 50:218-233

Luo C, Li T, Chen H (2014) Dynamic maintenance of approximations in set-valued ordered decision systems under the attribute generalization. Inf Sci 257:210–228
Luo C,Li T,Chen H(2014) 属性泛化下集合值有序决策系统中近似的动态维护.国际科学 257:210-228

Luo C, Li T, Chen H, Lu L (2015) Fast algorithms for computing rough approximations in set-valued decision systems while updating criteria values. Inf Sci 299:221–242
Luo C, Li T, Chen H, Lu L (2015) 在更新标准值时计算设定值决策系统中粗略近似的快速算法。国际科学 299:221–242

Orłowska E (1985) Logic of nondeterministic information. Stud Logica 44(1):91–100
Orłowska E (1985) 非确定性信息的逻辑. 螺柱 44(1):91-100

Orłowska E, Pawlak Z (1984) Representation of nondeterministic information. Theor Comput Sci 29(1–2):27–39
Orłowska E, Pawlak Z (1984) 非确定性信息的表示。理论计算科学 29(1-2):27-39

Pawlak Z (1991) Rough Sets: theoretical aspects of reasoning about data. Kluwer Academic Publishers, Dordrecht
Pawlak Z (1991) 粗糙集:关于数据推理的理论方面。Kluwer Academic Publishers, 多德雷赫特

Pawlak Z, Skowron A (2007a) Rough sets and Boolean reasoning. Inf Sci 177(1):41–73
Pawlak Z,Skowron A(2007a)粗糙集和布尔推理。国际科学177(1):41-73

Pawlak Z, Skowron A (2007b) Rough sets: some extensions. Inf Sci 177(1):28–40
Pawlak Z, Skowron A (2007b) 粗糙的集:一些扩展。国际科学 177(1):28-40

Pawlak Z, Skowron A (2007c) Rudiments of rough sets. Inf Sci 177(1):3–27
Pawlak Z, Skowron A (2007c) 粗糙集的雏形。国际科学177(1):3-27

Qian Y, Dang C, Liang J, Tang D (2009) Set-valued ordered information systems. Inf Sci 179(16):2809–2832
Qian Y, Dang C, Liang J, Tang D (2009) 集值有序信息系统。国际科学 179(16):2809–2832

Qian Y, Liang J, Pedrycz W, Dang C (2010a) Positive approximation: an accelerator for attribute reduction in rough set theory. Artif Intell 174(9–10):597–618
QianY,Liang J,PedryczW,Dang C(2010a)近似:粗糙集理论中属性约简的加速器。 Artif Intell 174(9-10):597-618

Qian YH, Liang JY, Song P, Dang CY (2010b) On dominance relations in disjunctive set-valued ordered information systems. Int J Inf Technol Decis Mak 9(01):9–33
Qian YH, Liang JY, Song P, Dang CY (2010b) 关于分离集值有序信息系统中的支配关系。国际技术决策杂志 9(01):9–33

Qian J, Miao DQ, Zhang ZH, Li W (2011) Hybrid approaches to attribute reduction based on indiscernibility and discernibility relation. Int J Approx Reason 52(2):212–230
Qian J, Miao DQ, Zhang ZH, Li W (2011) 基于不可辨别性和可辨别性关系的属性简化的混合方法。国际 J 近似原因 52(2):212–230

Robnik-Sˇikonja M, Kononenko I (2003) Theoretical and empirical
Robnik-Sˇikonja M,Kononenko(2003)理论实证

analysis of ReliefF and RReliefF. Mach Learn 53(1–2):23–69 Shi Y, Yao L, Xu J (2011) A probability maximization model based
ReliefF 和 RReliefF 的分析。马赫学习 53(1–2):23–69 ShiY,Yao L,Xu J(2011)基于概率最大化模型

on rough approximation and its application to the inventory problem. Int J Approx Reason 52(2):261–280
关于粗略近似及其在库存问题中的应用。国际 J 近似原因 52(2):261–280

Shoemaker CA, Ruiz C (2003) Association rule mining algorithms for set-valued data. In: International conference on intelligent data engineering and automated learning, Springer, Berlin,
Shoemaker CA,Ruiz C(2003)集合值数据的关联规则挖掘算法。在: 智能数据工程自动化学习国际会议,Springer,柏林,

pp. 669–676
669–676

Shu W, Qian W (2014) Mutual information-based feature selection from set-valued data. In: 26th IEEE international conference on tools with artificial intelligence (ICTAI), 2014, IEEE,
Shu W, Qian W (2014) 从集合值数据中选择基于互信息的特征。收录于: 第 26 届 IEEE 人工智能工具国际会议 (ICTAI),2014,IEEE

pp. 733–739
733–739

Wang H, Yue HB, Chen XE (2013) Attribute reduction in interval and set-valued decision information systems. Appl. Math. 4(11):1512
Wang H,Yue HB,Chen XE(2013)区间和设定值决策信息系统的属性减少应用数学4(11):1512

Data sets in articles. http://www.yuhuaqian.com
文章中的数据集http://www.yuhuaqian.com

Yang T, Li Q (2010) Reduction about approximation spaces of covering generalized rough sets. Int J Approx Reason 51(3):335–345
Yang T, Li Q (2010) 关于覆盖广义粗糙集的近似空间的减少。国际 J 近似原因 51(3):335–345

Yang QS, Wang GY, Zhang QH, MA XA (2010) Disjunctive set- valued ordered information systems based on variable precision dominance relation. J. Guangxi Normal Univ Nat Sci Ed 3:84–88 Yang X, Zhang M, Dou H, Yang J (2011) Neighborhood systems- based rough sets in incomplete information system. Knowl
YangQS,Wang GY,Zhang QH,MA XA(2010)基于变精度优势关系的析取集值有序信息系统。J. 广西师范大学自然科学版 3:84–88 YangX,Zhang M,Dou H,Yang J(2011)基于邻域系统的粗糙在不完整的信息系统中。诺尔

Based Syst 24(6):858–867
基础系统24(6):858–867

Yang X, Song X, Chen Z, Yang J (2012) On multigranulation rough sets in incomplete information system. Int J Mach Learn Cybern 3(3):223–232
Yang X, Song X, Chen Z, Yang J (2012) 关于不完整信息系统中的多颗粒粗糙集。国际马赫学习网络杂志 3(3):223–232

Yao YY (2001) Information granulation and rough set approximation.
YaoYY(2001)信息粒度粗糙近似.

Int J Intell Syst 16(1):87–104
国际知识系统杂志16(1):87–104

Yao YY, Liu Q (1999) A generalized decision logic in interval-set- valued information tables. In: International workshop on rough sets, fuzzy sets, data mining, and granular-soft computing, Springer, Berlin, pp. 285–293
Yao YY, Liu Q (1999) 区间集值信息表中的广义决策逻辑。收录于:粗糙集、模糊集、数据挖掘和粒度软计算国际研讨会,Springer,柏林,第 285-293 页

Zadeh LA (1996) Fuzzy sets. In: Fuzzy sets, fuzzy logic, and fuzzy systems: selected papers by Lotfi A Zadeh, pp. 394–432
Zadeh LA (1996) 模糊集。收录于:模糊集、模糊逻辑和模糊系统:Lotfi A Zadeh 论文选集,第 394-432 页

Zhang J, Li T, Ruan D, Liu D (2012) Rough sets based matrix approaches with dynamic attribute variation in set-valued information systems. Int J Approx Reason 53(4):620–635
Zhang J, Li T, Ruan D, Liu D (2012) 在集合值信息系统中具有动态属性变化的基于粗糙集的矩阵方法。国际 J 大约原因 53(4):620–635

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.