
Soft Computing (2020) 24:4675–4691 https://doi.org/10.1007/s00500-019-04228-4

A fuzzy similarity-based rough set approach for attribute selection in set-valued information systems

Shivani Singh¹ · Shivam Shreevastava² · Tanmoy Som² · Gaurav Somani²

Published online: 23 July 2019

Springer-Verlag GmbH Germany, part of Springer Nature 2019

Abstract

Databases obtained from different search engines, market data, patients' symptoms and behaviours, etc., are some common examples of set-valued data, in which a set of values is associated with a single entity. In the real-world data deluge, various irrelevant attributes lower the ability of experts both in speed and in predictive accuracy, due to high dimension and insignificant information, respectively. Attribute selection is the concept of selecting those attributes that ideally are necessary as well as sufficient to better describe the target knowledge. Rough set-based approaches can handle the uncertainty available in real-valued information systems after a discretization process. In this paper, we introduce a novel approach for attribute selection in set-valued information systems based on tolerance rough set theory. A fuzzy tolerance relation between two objects using a similarity threshold is defined. We find reducts based on the degree of dependency method for selecting the best subsets of attributes in order to obtain higher knowledge from the information system. Analogous results of rough set theory are established for the proposed method for validation. Moreover, we present a greedy algorithm along with some illustrative examples to clearly demonstrate our approach without checking each pair of attributes in set-valued decision systems. Examples for calculating the reduct of an incomplete information system are also given using the proposed approach. Comparisons are performed between the proposed approach and fuzzy rough-assisted attribute selection on a real benchmark dataset, as well as with three existing approaches for attribute selection on six real benchmark datasets, to show the superiority of the proposed work.

Keywords: Set-valued data · Rough set · Fuzzy tolerance relation · Degree of dependency · Attribute selection

Introduction

Communicated by V. Loia.

& Shivam Shreevastava shivam.rs.apm@itbhu.ac.in

Shivani Singh shivanithakur030@gmail.com

Tanmoy Som tsom.apm@itbhu.ac.in

Gaurav Somani gaurav.somani.mat14@itbhu.ac.in

1 DST-Centre for Interdisciplinary Mathematical Sciences, Institute of Science, BHU, Varanasi 221005, India

2 Department of Mathematical Sciences, IIT (BHU), Varanasi 221005, India

Many real applications in the area of machine learning and data mining consist of set-valued data, i.e. data where the attribute value of an object is not unique but a set of values; for example, in a venture investment company, the set of evaluation results given by experts (Qian et al. 2010b), the set of languages for each person from the foreign language ability test (Qian et al. 2009), and, in a medical database, the set of patients' symptoms and activities (He and Naughton 2009). These kinds of information systems are called set-valued information systems, which are another important type of data table and are generalized models of single-valued information systems. An incomplete information system (Kryszkiewicz 1998, 1999; Dai 2013; Dai and Xu 2012; Dai et al. 2013; Leung and Li 2003; Yang et al. 2011, 2012) can be converted into a set-valued information system by replacing all missing values with the set of all possible values of each attribute.

Set-valued information systems are always used to portray the inexact and lost information in a given dataset, in which the attribute set may vary with time as new information is added.

Dataset dimensionality is the main hurdle for computational applications in pattern recognition and other machine learning tasks. In many real-world applications, the generation and expansion of data occur continuously and thousands of attributes are stored in databases. Gathering useful information and mining required knowledge from an information system is the most difficult task in the area of knowledge-based systems. Not all attributes are relevant to the learning tasks, as they reduce the real performance of proposed algorithms and increase the training and testing times. In order to enhance the classification accuracy and knowledge prediction, attribute subset selection (Hu et al. 2008; Yang and Li 2010; Qian et al. 2010a; Qian et al. 2011; Shi et al. 2011; Jensen and Shen 2009; Jensen et al. 2009) plays a key role via eliminating redundant and inconsistent attributes. Attribute selection is the process of selecting the most informative attributes of a given information system to reduce the classification time, complexity and overfitting.

Rough set approximations (proposed by Pawlak 1991 and Pawlak and Skowron 2007a, b, c) are the central point of approaches to knowledge discovery. Rough set theory (RST) uses only internal information and does not depend on prior model conventions, and it can be used to extract and signify the hidden knowledge available in information systems. It has many applications in the fields of decision support, document analysis, data mining, pattern recognition, knowledge discovery and so on. In rough sets, several discrete partitions are needed in order to tackle real-valued attributes, and then the dependency of the decision attribute over conditional attributes is calculated. The intrinsic error due to this discretization process is the main issue while computing the degree of dependency of real-valued attributes.

Dubois and Prade (1992) combined a fuzzy set (Zadeh 1996) with a rough set and proposed the fuzzy rough set to provide an important tool for reasoning with uncertainty in real-valued datasets. Fuzzy rough sets combine the distinct concepts of indiscernibility (for rough sets) and vagueness (for fuzzy sets) available in the datasets and have been successfully applied to many fields. However, very few researchers are working in the area of set-valued information systems under the framework of the rough set model in a fuzzy environment.

In this paper, we introduce a novel approach for attribute selection in set-valued information systems based on tolerance rough set theory. We define a fuzzy relation between two objects of a set-valued information system. A fuzzy tolerance relation is introduced by using a similarity

threshold to avoid misclassification and perturbation in order to tackle uncertainty in a much better way. Based on this relation, we calculate tolerance classes of each object to determine lower and upper approximations of any subset of the universe of discourse. The positive region of the decision attribute over a subset of conditional attributes can be calculated using lower approximations. The degree of dependency of the decision attribute over a subset of conditional attributes is the ratio of the cardinality of the positive region to the cardinality of the universe of discourse. Analogous results of rough set theory are established for our proposed method for validation. Moreover, we present a greedy algorithm to clearly demonstrate our approach without calculating the degree of dependency for each pair of attributes. Illustrative example datasets are given for better understanding of our proposed approach. We compare the proposed approach with other existing approaches on real datasets and test the statistical significance of the obtained results.

The rest of the paper is structured as follows. Related works are given in Sect. 2. In Sect. 3, basic definitions related to incomplete and set-valued information systems are given. The proposed concept for set-valued datasets is presented and thoroughly studied in Sect. 4. In Sect. 5, analogous results of rough set theory are verified for the newly proposed approach. An algorithm for attribute selection in set-valued information systems is presented in Sect. 6. Illustrative examples with comparative analysis are given to demonstrate the proposed model in Sect. 7. In Sect. 8, experimental analysis is performed on six real benchmark datasets. Section 9 concludes our work.

Related works

Nowadays, set-valued datasets are generated through many sources. Dimensionality reduction is a key issue for such types of datasets in order to reduce complexity, time and cost. Different criteria have been proposed by a few researchers to deal with set-valued datasets and to evaluate the best suitable attributes in the process of attribute selection. Lipski (1979, 1981) gave the idea of representing an incomplete information system as a set-valued information system and studied their basic properties. He also investigated the semantic and logical problems that often occur in an incomplete information system. The concepts of internal and external interpretations are introduced in the paper. Internal interpretation is shown to lead towards the notion of topological Boolean algebra and a modal logic, whereas external interpretation, which is related to referring queries directly to reality, leads towards Boolean algebra and classical logic. Orlowska and Pawlak (1984) established a method to deal with non-deterministic information system

which is considered as set-valued data. They defined a language in order to define non-deterministic information and introduced the concept of knowledge representation system.

A generalized decision logic, an extension of the decision logic studied by Pawlak, for interval set-valued information systems is presented by Yao and Liu (1999). They introduced two types of satisfiability of a formula, namely interval-degree truth and interval-level truth. They also proposed the generalized decision logic DGL and interpreted this concept based on the two types of satisfiability. A detailed discussion on inference rules is also presented. Yao (2001) presented a concept of granulation for a universe of discourse in set-valued information systems and reviewed the corresponding approximation structure. The concepts of ordered granulation and approximation structures are used in defining stratified rough set approximations. He first defined a nested sequence of granulations and then a corresponding nested sequence of rough set approximations, which leads to a more general approximation structure.

Shoemaker and Ruiz (2003) introduced an extension of the Apriori algorithm that is able to mine association rules from set-valued data. They introduced two different algorithms for mining association rules from set-valued data and compared their outcomes. They established a system based on one of these algorithms and applied it on some biological datasets for justification. Set-valued information systems were presented by Guan and Wang (2006). To classify the universe of discourse, they proposed a tolerance relation and used maximal tolerance classes. They introduced the concept of relative reduct of maximal tolerance classes and used a Boolean reasoning technique for calculating the relative reduct by defining a discernibility function. The concepts of E-lower, A-upper and A-lower relative reducts for set-valued decision systems are also discussed in detail.

For the conjunctive/disjunctive types of set-valued ordered information systems, a dominance-based rough set approach was introduced by Qian et al. (2010b). This model is based on the substitution of the indiscernibility relation by a dominance relation. They also developed a new approach to sorting objects in disjunctive set-valued ordered information systems. This approach is useful in simplifying a disjunctive set-valued ordered information system. Criterion reduction for a set-valued ordered information system is also discussed. Based on a variable precision relation, Yang et al. (2010) generalized the notion of Qian et al. by defining an extended rough set model and propounded a variable precision dominance relation for set-valued ordered information systems. They presented an attribute reduction method based on the discernibility matrix approach by using their proposed relation.

Zhang et al. (2012) proposed matrix approaches based on rough set theory with dynamic variation of attributes in set-valued information systems. In this paper, they defined the lower and upper approximations directly by using the basic vector generated by the relation matrix in the set-valued information system. The concept of updating the lower and upper approximations is also introduced by use of the variation of the relation matrix. Luo et al. (2013) investigated the updating mechanisms for computing lower and upper approximations with the variation of the object set. The authors proposed two incremental algorithms for updating the defined approximations in disjunctive/conjunctive set-valued information systems. After experiments on several datasets for checking the performance of the proposed algorithms, they showed that the incremental approaches are far better than the non-incremental approaches.

Wang et al. (2013) defined a new fuzzy preference relation and fuzzy rough set technique for disjunctive-type interval and set-valued information systems. They discussed the concept of relative significance measure of conditional attributes in interval and set-valued decision systems by using the degree of dependency approach. In this paper, the authors mainly focused on the semantic interpretation of the disjunctive type only. They also presented an algorithm for calculating the fuzzy positive region in interval and set-valued decision systems.

An incremental algorithm was designed to reduce the size of dynamic set-valued information systems by Lang et al. (2014). They presented three different relations and investigated their basic properties. Two types of discernibility matrices based on these relations for set-valued decision systems are also introduced. Furthermore, using the proposed relations and information system homomorphisms, a large-scale set-valued information system is compressed into a smaller information system. They addressed the compression updating via variations of the feature set, immigration and emigration of objects and alterations of attribute values.

In set-valued ordered decision systems, Luo et al. (2014) worked on maintaining approximations dynamically and studied the approximations of decision classes by defining the dominant and dominated matrices via a dominance relation. The updating properties for dynamic maintenance of approximations were also introduced for when the evolution of the criteria values with time occurs in the set-valued decision system. Firstly, they constructed a matrix-based approach for computing lower and upper approximations of upward and downward unions of decision classes. Furthermore, incremental approaches for updating approximations are presented by modifying relevant matrices without retraining from the start on all accumulated training data.

Shu and Qian (2014) presented an attribute selection method for set-valued data based on the mutual information of the unmarked objects. Mutual information-based feature selection methods use the concept of dependency among features. Unlike the traditional approaches, here the mutual information is calculated on the unmarked objects in the set-valued data. Furthermore, a mutual information-based feature selection algorithm is developed and implemented on a universe of discourse to fasten the feature selection process. Due to the dynamic variation of criteria values in set-valued information systems, Luo et al. (2015) presented the properties for dynamic maintenance of approximations. Two incremental algorithms for modernizing the approximations in disjunctive/conjunctive set-valued information systems are presented, corresponding to the addition and removal of criteria values, respectively.

Most of the above approaches are based on classical and rough set techniques, which have their own limitation of discretization, which leads to information loss. Rough set in fuzzy environment-based methods deal with the uncertainty as well as the noise available in an information system in a much better way as compared to classical and rough set-based approaches, without the requirement of any discretization process. Dai et al. (2013) defined a fuzzy relation between two objects and constructed a fuzzy rough set model for attribute reduction in set-valued information systems based on discernibility matrices. In this paper, the similarity of two objects in a set-valued information system is taken up to a threshold value in order to avoid misclassification and perturbation, and a tolerance rough set-based attribute selection is presented by using the degree of dependency approach.

Preliminaries 

In this section, we describe some basic concepts, symbolization and examples of set-valued information systems.

Definition 3.1 (Huang 1992) A quadruple $IS = (U, AT, V, h)$ is called an information system, where $U = \{u_1, u_2, \ldots, u_n\}$ is a non-empty finite set of objects, called the universe of discourse, $AT = \{a_1, a_2, \ldots, a_m\}$ is a non-empty finite set of attributes, $V = \bigcup_{a \in AT} V_a$, where $V_a$ is the set of attribute values associated with each attribute $a \in AT$, and $h: U \times AT \to V$ is an information function that assigns particular values to the objects against the attribute set such that $\forall a \in AT, \forall u \in U,\ h(u, a) \in V_a$.

Definition 3.2 (Huang 1992) In an information system, if each attribute has a single entity as its attribute value, then it is called a single-valued information system; otherwise it is known as a set-valued information system. A set-valued information system is a generalization of the single-valued information system, in which an object can have more than one attribute value. Table 1 illustrates a set-valued information system.

Table 1 Set-valued information system

U     c1            c2            c3       c4
u1    {1,2,3,4}     {0,1}         {1,2}    0.4
u2    {2,3}         {2,3}         {1}      0.5
u3    {1,2,3,4}     {1,2}         {1,2}    0.9
u4    {2,3,4}       {0,1,2,3}     {0,1}    0.2
u5    {2,4}         {0,1,2}       {0,1}    1

Definition 3.3 (Guan and Wang 2006) An information system $IS = (U, AT, V, h)$ is said to be a set-valued decision system if $AT = C \cup D$, where $C$ is a non-empty finite set of conditional attributes and $D$ is a non-empty collection of decision attributes with $C \cap D = \emptyset$. Here $V = V_C \cup V_D$, with $V_C$ and $V_D$ as the sets of conditional attribute values and decision attribute values, respectively, and $h$ is a mapping from $U \times (C \cup D)$ to $V$ such that $h: U \times C \to 2^{V_C}$ is a set-valued mapping and $h: U \times D \to V_D$ is a single-valued mapping. Table 2 exemplifies a set-valued decision system.

To give a semantic interpretation of the set-valued data, many ways are given (Guan and Wang 2006); here we encapsulate them as two types. In Type 1, $h(u, a)$ is interpreted conjunctively, and in Type 2, $h(u, a)$ is interpreted disjunctively. For example, if $a$ is the attribute "speaking a language", then $h(u, a) = \{\text{Chinese, Spanish, English}\}$ can be inferred as: $u$ speaks Chinese, Spanish and English in the case of Type 1, and $u$ speaks Chinese or Spanish or English, i.e. $u$ can speak only one of them, in the case of Type 2. Incomplete information systems with some unknown attribute values or partially known attribute values are Type 2 set-valued information systems.

In many real-world application problems, a lot of missing data exists in the information system due to ambiguity and incompleteness. All missing values presented in an information system can be characterized by the set of all possible values of the corresponding attribute. This type of information system can also be considered as a special case of a set-valued information system, in which some attribute values of objects are missing. Table 4 illustrates the transformation of an incomplete information system into a set-valued information system.
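For illustration, a minimal Python sketch of this conversion is given below; the function name, the missing-value markers and the dictionary layout are our own assumptions rather than part of the paper. Every missing entry is replaced by the set of all values observed for that attribute.

def to_set_valued(objects, attributes, missing=(None, '?')):
    # domain of each attribute = all values observed in the known entries
    domains = {a: {obj[a] for obj in objects.values() if obj[a] not in missing}
               for a in attributes}
    converted = {}
    for name, obj in objects.items():
        converted[name] = {a: frozenset(domains[a]) if obj[a] in missing
                           else frozenset({obj[a]})
                           for a in attributes}
    return converted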

Table 2 Set-valued decision system

U     c1            c2            c3       c4     D
u1    {1,2,3,4}     {0,1}         {1,2}    0.4    1
u2    {2,3}         {2,3}         {1}      0.5    1
u3    {1,2,3,4}     {1,2}         {1,2}    0.9    2
u4    {2,3,4}       {0,1,2,3}     {0,1}    0.2    1
u5    {2,4}         {0,1,2}       {0,1}    1      2
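The code sketches in the following sections assume an in-memory representation of Table 2 along the following lines; the dictionary layout and names are ours, not the paper's.

table2 = {
    'u1': {'c1': frozenset({1, 2, 3, 4}), 'c2': frozenset({0, 1}),
           'c3': frozenset({1, 2}), 'c4': 0.4, 'd': 1},
    'u2': {'c1': frozenset({2, 3}), 'c2': frozenset({2, 3}),
           'c3': frozenset({1}), 'c4': 0.5, 'd': 1},
    'u3': {'c1': frozenset({1, 2, 3, 4}), 'c2': frozenset({1, 2}),
           'c3': frozenset({1, 2}), 'c4': 0.9, 'd': 2},
    'u4': {'c1': frozenset({2, 3, 4}), 'c2': frozenset({0, 1, 2, 3}),
           'c3': frozenset({0, 1}), 'c4': 0.2, 'd': 1},
    'u5': {'c1': frozenset({2, 4}), 'c2': frozenset({0, 1, 2}),
           'c3': frozenset({0, 1}), 'c4': 1.0, 'd': 2},
}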

Table 3 Missing value dataset

Table 4 Set-valued decision system obtained from Table 3

Proposed methodology

In this section, we define a new kind of fuzzy relation between two objects, and its supremacy over the previously defined relation is shown using an example. Then, lower and upper approximations are defined using a threshold value on the fuzzy similarity degree. Some basic properties of the above-defined lower and upper approximations are also discussed.

Fuzzy relation between objects

In this subsection, we first present the definition of the tolerance relation available in the literature and then propose a new fuzzy relation between two objects. In continuation, we compare both definitions through an example.

Definition 4.1 (Orlowska 1985; Yao 2001) For a set-valued information system $IS = (U, AT, V, h)$, $\forall b \in AT$ and $u_i, u_j \in U$, a tolerance relation is defined as
$T_b = \{(u_i, u_j) \mid b(u_i) \cap b(u_j) \neq \emptyset\}.$   (1)
For $B \subseteq AT$, a tolerance relation is defined as
$T_B = \{(u_i, u_j) \mid b(u_i) \cap b(u_j) \neq \emptyset,\ \forall b \in B\} = \bigcap_{b \in B} T_b,$   (2)
where $(u_i, u_j) \in T_B$ implies that $u_i$ and $u_j$ are indiscernible (tolerant) with respect to the set of attributes $B$.

Example 4.1 Let $(U, AT, V, h)$ be a set-valued information system with $b \in AT$ and $u_1, u_2, u_3 \in U$ such that $b(u_1) = \{v_1, v_2, v_3, v_4\}$, $b(u_2) = \{v_4, v_5, v_6, v_7\}$ and $b(u_3) = \{v_1, v_2, v_3\}$. Then, by Definition 4.1, both $(u_1, u_2)$ and $(u_1, u_3)$ belong to $T_b$; that is, $u_1, u_2$ are indiscernible with respect to attribute $b$ and, simultaneously, $u_1, u_3$ are indiscernible with respect to attribute $b$.

It is obvious from the above example that discerning $u_1$ and $u_3$ is more difficult than discerning $u_1$ and $u_2$, but Definition 4.1 is not able to describe the extent to which two objects are related. To overcome this issue, we define a fuzzy relation for a set-valued dataset.

Definition 4.2 Let $SVIS = (U, AT, V, h)$ be a set-valued information system. For every $b \in AT$, we define a fuzzy relation $R_b$ as
$\mu_{R_b}(u_i, u_j) = \frac{2\,|b(u_i) \cap b(u_j)|}{|b(u_i)| + |b(u_j)|}.$   (3)
For a set of attributes $B \subseteq AT$, a fuzzy relation $R_B$ can be defined as
$\mu_{R_B}(u_i, u_j) = \inf_{b \in B} \mu_{R_b}(u_i, u_j).$   (4)

Example 4.2 (continued from Example 4.1) After calculating the degree of similarity by using the fuzzy relation defined in Eq. (3), we get
$\mu_{R_b}(u_1, u_2) = \frac{2\,|b(u_1) \cap b(u_2)|}{|b(u_1)| + |b(u_2)|} = \frac{2}{8} = 0.25,$
$\mu_{R_b}(u_1, u_3) = \frac{2\,|b(u_1) \cap b(u_3)|}{|b(u_1)| + |b(u_3)|} = \frac{6}{7} \approx 0.86.$
Now, we can easily calculate the degree to which two objects are discernible.
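A minimal Python sketch of Eqs. (3) and (4) follows; the function names are ours, and for a real-valued attribute such as c4 in Table 2 we assume the similarity 1 - |x - y|, which appears consistent with the corresponding block of Table 5 but is not stated explicitly in the paper.

def mu_b(x, y):
    """Eq. (3) for set-valued entries; 1 - |x - y| assumed for numeric entries."""
    if isinstance(x, frozenset):
        return 2.0 * len(x & y) / (len(x) + len(y))
    return 1.0 - abs(x - y)

def mu_B(obj_i, obj_j, B):
    """Eq. (4): infimum of the per-attribute similarities over B."""
    return min(mu_b(obj_i[b], obj_j[b]) for b in B)

# Example 4.2 revisited
b1 = frozenset({'v1', 'v2', 'v3', 'v4'})
b2 = frozenset({'v4', 'v5', 'v6', 'v7'})
b3 = frozenset({'v1', 'v2', 'v3'})
print(mu_b(b1, b2))            # 0.25
print(round(mu_b(b1, b3), 2))  # 0.86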

Fuzzy tolerance relation-assisted rough approximations

In this subsection, a fuzzy tolerance relation using a threshold value is defined, and lower and upper approximations of a set are presented.

If we ignore some misclassification and perturbation by using a threshold value on the fuzzy relation between two objects as given in Eq. (3), then the involvement of fuzzy sets in the computation of the fuzzy lower approximation will increase and the fuzzy positive region enlarges. Thus, the knowledge representation ability becomes much stronger with respect to misclassification.

So we define a new kind of binary relation using a threshold value $\alpha$ as follows:
$T_b^{\alpha} = \{(u_i, u_j) \mid \mu_{R_b}(u_i, u_j) \geq \alpha\},$   (5)
where $\alpha \in (0, 1)$ is a similarity threshold, which gives a level of similarity for the insertion of objects within tolerance classes.
For a set of attributes $B \subseteq AT$, we define a binary relation as
$T_B^{\alpha} = \{(u_i, u_j) \mid \mu_{R_B}(u_i, u_j) \geq \alpha\},$   (6)
where $\mu_{R_b}(u_i, u_j)$ and $\mu_{R_B}(u_i, u_j)$ are defined by Eqs. (3) and (4), respectively.

Definition 4.3 A fuzzy binary relation $\tilde{R}_b(u_i, u_j)$ between objects $u_i, u_j \in U$ is said to be a fuzzy tolerance relation if it is reflexive (i.e. $\tilde{R}_b(u_i, u_i) = 1,\ \forall u_i \in U$) and symmetric (i.e. $\tilde{R}_b(u_i, u_j) = \tilde{R}_b(u_j, u_i),\ \forall u_i, u_j \in U$).

Lemma 4.1 $T_b^{\alpha}$ is a tolerance relation.

Proof Reflexive: since $\mu_{R_b}(u_i, u_i) = \frac{2\,|b(u_i) \cap b(u_i)|}{|b(u_i)| + |b(u_i)|} = \frac{2\,|b(u_i)|}{2\,|b(u_i)|} = 1 \geq \alpha$, we have $(u_i, u_i) \in T_b^{\alpha}$. Symmetric: let $(u_i, u_j) \in T_b^{\alpha}$, i.e. $\mu_{R_b}(u_i, u_j) = \frac{2\,|b(u_i) \cap b(u_j)|}{|b(u_i)| + |b(u_j)|} \geq \alpha$. Now $\mu_{R_b}(u_j, u_i) = \frac{2\,|b(u_j) \cap b(u_i)|}{|b(u_j)| + |b(u_i)|} = \mu_{R_b}(u_i, u_j) \geq \alpha$, so $(u_j, u_i) \in T_b^{\alpha}$. Therefore, $T_b^{\alpha}$ is a fuzzy tolerance relation. $\square$

Example 4.3 If we take $\alpha = 0.3$ in Eq. (5) and apply it on Example 4.1, then we can see that only $(u_1, u_3)$ belongs to $T_b^{\alpha}$, i.e. only $u_1$ and $u_3$ are indiscernible with respect to attribute $b$. So, our proposed definition gives a more precise tolerance relation than the previous one.

Now, we define tolerance classes for an object $u_i$ with respect to $b \in B$ as follows:
$[T_b^{\alpha}](u_i) = \{u_j \in U \mid u_i\, T_b^{\alpha}\, u_j\},$   (7)
$[T_B^{\alpha}](u_i) = \{u_j \in U \mid u_i\, T_b^{\alpha}\, u_j,\ \forall b \in B\}.$   (8)

Then we propose lower and upper approximations of any object set $X \subseteq U$ as:
$T_B^{\alpha} \downarrow X = \{u_i \in U \mid [T_B^{\alpha}](u_i) \subseteq X\},$   (9)
$T_B^{\alpha} \uparrow X = \{u_i \in U \mid [T_B^{\alpha}](u_i) \cap X \neq \emptyset\}.$   (10)
The tuple $\langle T_B^{\alpha} \downarrow X,\ T_B^{\alpha} \uparrow X \rangle$ is called a tolerance rough set.
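A minimal sketch of Eqs. (7)-(10), reusing mu_b from the previous sketch and a data layout like the table2 dictionary shown earlier (helper names are ours):

def tolerance_class(data, i, B, alpha):
    """[T_B^alpha](u_i): objects at similarity >= alpha to u_i on every b in B."""
    return {j for j in data
            if all(mu_b(data[i][b], data[j][b]) >= alpha for b in B)}

def lower_approx(data, X, B, alpha):
    """Eq. (9): objects whose tolerance class is contained in X."""
    return {i for i in data if tolerance_class(data, i, B, alpha) <= set(X)}

def upper_approx(data, X, B, alpha):
    """Eq. (10): objects whose tolerance class intersects X."""
    return {i for i in data if tolerance_class(data, i, B, alpha) & set(X)}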

Properties of lower and upper approximations

In this subsection, we examine results on the lower and upper approximations, analogous to those of Dubois and Prade (1992), for our proposed approach. Let $(U, C \cup D, V, h)$ be a set-valued decision system, $B \subseteq C$, $X \subseteq U$ and $\alpha \in (0, 1)$.

Theorem 4.1 $T_B^{\alpha} \downarrow X \subseteq X \subseteq T_B^{\alpha} \uparrow X$.

Proof Since $x \in [T_B^{\alpha}](x)$, $x \in T_B^{\alpha} \downarrow X$ implies $[T_B^{\alpha}](x) \subseteq X$ and hence $x \in X$; therefore $T_B^{\alpha} \downarrow X \subseteq X$. Now let $x \in X$; since $x \in [T_B^{\alpha}](x)$, we have $[T_B^{\alpha}](x) \cap X \neq \emptyset$, so $x \in T_B^{\alpha} \uparrow X$; therefore $X \subseteq T_B^{\alpha} \uparrow X$. Hence $T_B^{\alpha} \downarrow X \subseteq X \subseteq T_B^{\alpha} \uparrow X$. $\square$

Theorem 4.2 Let $B_1 \subseteq B_2 \subseteq C$; then
(i) $T_{B_1}^{\alpha} \downarrow X \subseteq T_{B_2}^{\alpha} \downarrow X$,
(ii) $T_{B_2}^{\alpha} \uparrow X \subseteq T_{B_1}^{\alpha} \uparrow X$.

Proof Since $[T_B^{\alpha}](x) = \{y \in U \mid \mu_{R_b}(x, y) \geq \alpha,\ \forall b \in B\}$, $B_1 \subseteq B_2$ implies $[T_{B_2}^{\alpha}](x) \subseteq [T_{B_1}^{\alpha}](x)$ for every $x \in U$.
(i) Let $x \in T_{B_1}^{\alpha} \downarrow X$; then $[T_{B_1}^{\alpha}](x) \subseteq X$, hence $[T_{B_2}^{\alpha}](x) \subseteq X$, so $x \in T_{B_2}^{\alpha} \downarrow X$.
(ii) Let $x \in T_{B_2}^{\alpha} \uparrow X$; then $[T_{B_2}^{\alpha}](x) \cap X \neq \emptyset$, hence $[T_{B_1}^{\alpha}](x) \cap X \neq \emptyset$, so $x \in T_{B_1}^{\alpha} \uparrow X$. $\square$

Theorem 4.3 Let $\alpha_1 \leq \alpha_2$; then
(i) $T_B^{\alpha_1} \downarrow X \subseteq T_B^{\alpha_2} \downarrow X$,
(ii) $T_B^{\alpha_2} \uparrow X \subseteq T_B^{\alpha_1} \uparrow X$.

Proof Since $[T_B^{\alpha}](x) = \{y \in U \mid \mu_{R_b}(x, y) \geq \alpha,\ \forall b \in B\}$ and $\alpha_1 \leq \alpha_2$, we have $[T_B^{\alpha_2}](x) \subseteq [T_B^{\alpha_1}](x)$ for every $x \in U$.
(i) Let $x \in T_B^{\alpha_1} \downarrow X$; then $[T_B^{\alpha_1}](x) \subseteq X$, hence $[T_B^{\alpha_2}](x) \subseteq X$, so $x \in T_B^{\alpha_2} \downarrow X$.
(ii) Let $y \in T_B^{\alpha_2} \uparrow X$; then $[T_B^{\alpha_2}](y) \cap X \neq \emptyset$, hence $[T_B^{\alpha_1}](y) \cap X \neq \emptyset$, so $y \in T_B^{\alpha_1} \uparrow X$. $\square$

Theorem 4.4 $T_B^{\alpha} \downarrow (X^C) = (T_B^{\alpha} \uparrow X)^C$, where $X^C$ denotes the complement of the set $X$.

Proof $y \in T_B^{\alpha} \downarrow (X^C) \Leftrightarrow [T_B^{\alpha}](y) \subseteq X^C \Leftrightarrow [T_B^{\alpha}](y) \cap X = \emptyset \Leftrightarrow y \notin T_B^{\alpha} \uparrow X \Leftrightarrow y \in (T_B^{\alpha} \uparrow X)^C$. Hence, $T_B^{\alpha} \downarrow (X^C) = (T_B^{\alpha} \uparrow X)^C$. $\square$

Theorem 4.5 Let $Y \subseteq U$ be another set of objects; then the following properties hold:
(i) $T_B^{\alpha} \downarrow (X \cap Y) = T_B^{\alpha} \downarrow X \cap T_B^{\alpha} \downarrow Y$,
(ii) $T_B^{\alpha} \uparrow (X \cup Y) = T_B^{\alpha} \uparrow X \cup T_B^{\alpha} \uparrow Y$.

Proof (i) $z \in T_B^{\alpha} \downarrow (X \cap Y) \Leftrightarrow [T_B^{\alpha}](z) \subseteq X \cap Y \Leftrightarrow [T_B^{\alpha}](z) \subseteq X$ and $[T_B^{\alpha}](z) \subseteq Y \Leftrightarrow z \in T_B^{\alpha} \downarrow X$ and $z \in T_B^{\alpha} \downarrow Y \Leftrightarrow z \in T_B^{\alpha} \downarrow X \cap T_B^{\alpha} \downarrow Y$.
(ii) $z \in T_B^{\alpha} \uparrow (X \cup Y) \Leftrightarrow [T_B^{\alpha}](z) \cap (X \cup Y) \neq \emptyset \Leftrightarrow [T_B^{\alpha}](z) \cap X \neq \emptyset$ or $[T_B^{\alpha}](z) \cap Y \neq \emptyset \Leftrightarrow z \in T_B^{\alpha} \uparrow X$ or $z \in T_B^{\alpha} \uparrow Y \Leftrightarrow z \in T_B^{\alpha} \uparrow X \cup T_B^{\alpha} \uparrow Y$. $\square$

Theorem 4.6 $T_B^{\alpha} \downarrow U = U = T_B^{\alpha} \uparrow U$ and $T_B^{\alpha} \downarrow \emptyset = \emptyset = T_B^{\alpha} \uparrow \emptyset$.

Proof Easy to check. $\square$
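As a quick numerical illustration (ours, not part of the original paper), the duality in Theorem 4.4 can be checked on the Table 2 data with the helpers sketched earlier:

U = set(table2)
X = {'u1', 'u2', 'u4'}                   # an arbitrary subset of U
B, alpha = ['c1', 'c2'], 0.7

left = lower_approx(table2, U - X, B, alpha)    # lower approximation of the complement
right = U - upper_approx(table2, X, B, alpha)   # complement of the upper approximation of X
assert left == right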

Theorem 4.7 For every $x \in U$, $T_B^{\alpha} \downarrow [T_B^{\alpha}](x) \subseteq [T_B^{\alpha}](x) \subseteq T_B^{\alpha} \uparrow [T_B^{\alpha}](x)$; if, in addition, $T_B^{\alpha}$ is transitive, the three sets coincide.

Proof The two inclusions follow from Theorem 4.1 with $X = [T_B^{\alpha}](x)$. When $T_B^{\alpha}$ is transitive, $y \in [T_B^{\alpha}](x)$ implies $[T_B^{\alpha}](y) \subseteq [T_B^{\alpha}](x)$, so $y \in T_B^{\alpha} \downarrow [T_B^{\alpha}](x)$; and if $z \in T_B^{\alpha} \uparrow [T_B^{\alpha}](x)$, then there exists $y \in [T_B^{\alpha}](z) \cap [T_B^{\alpha}](x)$, whence $z \in [T_B^{\alpha}](x)$. $\square$

Theorem 4.8 $T_B^{\alpha} \uparrow \{x\} = [T_B^{\alpha}](x)$ for every $x \in U$.

Proof $z \in T_B^{\alpha} \uparrow \{x\} \Leftrightarrow [T_B^{\alpha}](z) \cap \{x\} \neq \emptyset \Leftrightarrow x \in [T_B^{\alpha}](z) \Leftrightarrow z \in [T_B^{\alpha}](x)$, by the symmetry of $T_B^{\alpha}$. $\square$

Theorem 4.9 If $T_B^{\alpha}$ is transitive (in particular, an equivalence relation), then
(i) $T_B^{\alpha} \downarrow (T_B^{\alpha} \downarrow X) = T_B^{\alpha} \downarrow X$,
(ii) $T_B^{\alpha} \uparrow (T_B^{\alpha} \uparrow X) = T_B^{\alpha} \uparrow X$.

Proof (i) $T_B^{\alpha} \downarrow (T_B^{\alpha} \downarrow X) \subseteq T_B^{\alpha} \downarrow X$ follows from Theorem 4.1. Conversely, let $y \in T_B^{\alpha} \downarrow X$ and $z \in [T_B^{\alpha}](y)$; for any $u \in [T_B^{\alpha}](z)$, transitivity gives $u \in [T_B^{\alpha}](y) \subseteq X$, so $z \in T_B^{\alpha} \downarrow X$; hence $[T_B^{\alpha}](y) \subseteq T_B^{\alpha} \downarrow X$ and $y \in T_B^{\alpha} \downarrow (T_B^{\alpha} \downarrow X)$.
(ii) $T_B^{\alpha} \uparrow X \subseteq T_B^{\alpha} \uparrow (T_B^{\alpha} \uparrow X)$ follows from Theorem 4.1. Conversely, let $y \in T_B^{\alpha} \uparrow (T_B^{\alpha} \uparrow X)$; then there is $z \in [T_B^{\alpha}](y)$ with $[T_B^{\alpha}](z) \cap X \neq \emptyset$, say $u \in [T_B^{\alpha}](z) \cap X$; transitivity gives $u \in [T_B^{\alpha}](y)$, so $[T_B^{\alpha}](y) \cap X \neq \emptyset$ and $y \in T_B^{\alpha} \uparrow X$. $\square$

Theorem 4.10
(i) $T_B^{\alpha} \downarrow X \subseteq T_B^{\alpha} \uparrow (T_B^{\alpha} \downarrow X) \subseteq T_B^{\alpha} \uparrow X$,
(ii) $T_B^{\alpha} \downarrow X \subseteq T_B^{\alpha} \downarrow (T_B^{\alpha} \uparrow X) \subseteq T_B^{\alpha} \uparrow X$.

Proof (i) Since $X \subseteq T_B^{\alpha} \uparrow X$ holds for any set (Theorem 4.1), replacing $X$ by $T_B^{\alpha} \downarrow X$ gives the first inclusion; the second follows from $T_B^{\alpha} \downarrow X \subseteq X$ together with the monotonicity of $\uparrow$ with respect to set inclusion.
(ii) From $T_B^{\alpha} \downarrow X \subseteq X \subseteq T_B^{\alpha} \uparrow X$ and the monotonicity of $\downarrow$ we obtain the first inclusion; replacing $X$ by $T_B^{\alpha} \uparrow X$ in $T_B^{\alpha} \downarrow X \subseteq X$ gives the second. $\square$

In the next section, we propose an attribute selection method for set-valued information systems based on the degree of dependency approach, using the above-defined lower approximation.

Degree of dependency-based attribute selection

Using the lower approximation defined above, the positive region of the set of decision attributes $D$ over a set of conditional attributes $B$ is defined as
$POS_B^{\alpha}(D) = \bigcup_{X \in U/D} (T_B^{\alpha} \downarrow X),$   (25)
where $U/D$ is the collection of classes of objects having the same decision values.

Now, using the definition of the positive region, we compute the degree of dependency of the decision attribute $D$ over the set of conditional attributes $B$ as
$\gamma_B^{\alpha}(D) = \frac{|POS_B^{\alpha}(D)|}{|U|},$   (26)
where $|\cdot|$ denotes the cardinality of a set and $\gamma_B^{\alpha}(D) \in [0, 1]$.

Theorem 5.1 Let $(U, C \cup D, V, h)$ be a set-valued decision system, $X \subseteq U$ and $\alpha \in (0, 1)$. If $B_1 \subseteq B_2 \subseteq C$, then $POS_{B_1}^{\alpha}(D) \subseteq POS_{B_2}^{\alpha}(D)$.

Proof If $B_1 \subseteq B_2$, we have $T_{B_1}^{\alpha} \downarrow X \subseteq T_{B_2}^{\alpha} \downarrow X$, as proved in Theorem 4.2(i), so that $\bigcup_{X \in U/D}(T_{B_1}^{\alpha} \downarrow X) \subseteq \bigcup_{X \in U/D}(T_{B_2}^{\alpha} \downarrow X)$. Therefore, $POS_{B_1}^{\alpha}(D) \subseteq POS_{B_2}^{\alpha}(D)$. $\square$

Theorem 5.2 Let $(U, C \cup D, V, h)$ be a set-valued decision system and $X \subseteq U$. If $\alpha_1 \leq \alpha_2$, then $POS_B^{\alpha_1}(D) \subseteq POS_B^{\alpha_2}(D)$.

Proof If $\alpha_1 \leq \alpha_2$, we have $T_B^{\alpha_1} \downarrow X \subseteq T_B^{\alpha_2} \downarrow X$, as proved in Theorem 4.3(i), so that $\bigcup_{X \in U/D}(T_B^{\alpha_1} \downarrow X) \subseteq \bigcup_{X \in U/D}(T_B^{\alpha_2} \downarrow X)$. Therefore, $POS_B^{\alpha_1}(D) \subseteq POS_B^{\alpha_2}(D)$. $\square$

Theorem 5.3 (Monotonicity of $\gamma_B(D)$) Suppose that $B \subseteq AT$, $\{c\}$ is an arbitrary conditional attribute belonging to the dataset, $D$ is the set of decision attributes and $\alpha \in (0, 1)$; then $\gamma_{B \cup \{c\}}^{\alpha}(D) \geq \gamma_B^{\alpha}(D)$.

Proof Since $T_{B \cup \{c\}}^{\alpha} \subseteq T_B^{\alpha}$, we have $[T_{B \cup \{c\}}^{\alpha}](u_i) \subseteq [T_B^{\alpha}](u_i)$ for every $u_i \in U$, which implies $T_B^{\alpha} \downarrow X \subseteq T_{B \cup \{c\}}^{\alpha} \downarrow X$ for every $X \subseteq U$. Hence $POS_B^{\alpha}(D) \subseteq POS_{B \cup \{c\}}^{\alpha}(D)$ and, since $\gamma_B^{\alpha}(D) = |POS_B^{\alpha}(D)|/|U|$, it follows that $\gamma_{B \cup \{c\}}^{\alpha}(D) \geq \gamma_B^{\alpha}(D)$. $\square$

A subset $B$ of the conditional attribute set $C$ is said to be a reduct of the set-valued decision system (SVDS) if
$\gamma_B^{\alpha}(D) = \gamma_C^{\alpha}(D)$ and $\gamma_{B - \{b_i\}}^{\alpha}(D) < \gamma_B^{\alpha}(D),\ \forall b_i \in B.$   (27)

The selection of attributes in the reduct set is achieved by comparing the degrees of dependency of the decision attribute over sets of conditional attributes. Attributes are selected one by one until the reduct set provides the same quality of classification as the original set.
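A minimal sketch of Eqs. (25) and (26), reusing the helpers from the earlier sketches (names are ours):

def decision_classes(data, d='d'):
    """U/D: group objects by their decision value."""
    classes = {}
    for name, obj in data.items():
        classes.setdefault(obj[d], set()).add(name)
    return list(classes.values())

def positive_region(data, B, alpha, d='d'):
    """Eq. (25): union of the lower approximations of the decision classes."""
    pos = set()
    for X in decision_classes(data, d):
        pos |= lower_approx(data, X, B, alpha)
    return pos

def gamma(data, B, alpha, d='d'):
    """Eq. (26): |POS_B(D)| / |U|."""
    return len(positive_region(data, B, alpha, d)) / float(len(data))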

An algorithm for tolerance rough set-based attribute selection of set-valued data

In this section, a quick reduct algorithm for attribute selection in set-valued information systems is presented by using the degree of dependency method based on the tolerance relation. Initially, the proposed algorithm starts with an empty set and adds attributes one by one to calculate the degree of dependency of the decision attribute over a set of conditional attributes. It selects those conditional attributes which provide the maximum increase in the degree of dependency of the decision attribute. The proposed algorithm is given as follows.
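A sketch of this greedy procedure is shown below; it is our reading of the description above rather than the paper's exact listing, and it reuses gamma() from the previous sketch:

def quick_reduct(data, conditional, alpha, d='d'):
    """Greedy quick reduct: grow the subset until it reaches the dependency
    of the full conditional attribute set (or no attribute improves it)."""
    full = gamma(data, conditional, alpha, d)
    reduct, current = [], 0.0
    while current < full:
        best_attr, best_gain = None, current
        for a in conditional:
            if a in reduct:
                continue
            g = gamma(data, reduct + [a], alpha, d)
            if g > best_gain:
                best_attr, best_gain = a, g
        if best_attr is None:      # no attribute increases the dependency
            break
        reduct.append(best_attr)
        current = best_gain
    return reduct

# e.g. quick_reduct(table2, ['c1', 'c2', 'c3', 'c4'], alpha=0.7)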

The main advantage of the proposed algorithm is that it produces a close-to-minimal reduct set of a decision system without exhaustively checking all possible subsets of conditional attributes.

Now, we apply the above proposed algorithm on some example datasets to demonstrate our approach.

Illustrative examples

Example 7.1 Consider the set-valued decision system given in Table 2. The fuzzy tolerance relation between objects $u_i, u_j \in U$, calculated by using Eq. (3), is given in Table 5.

Table 5 Fuzzy tolerance relation

$\mu_{R_{c1}}(u_i, u_j)$:
      u1     u2     u3     u4     u5
u1    1      0.67   1      0.86   0.67
u2    0.67   1      0.67   0.8    0.5
u3    1      0.67   1      0.86   0.67
u4    0.86   0.8    0.86   1      0.8
u5    0.67   0.5    0.67   0.8    1

$\mu_{R_{c2}}(u_i, u_j)$:
      u1     u2     u3     u4     u5
u1    1      0      0.5    0.67   0.8
u2    0      1      0.5    0.67   0.4
u3    0.5    0.5    1      0.67   0.8
u4    0.67   0.67   0.67   1      0.86
u5    0.8    0.4    0.8    0.86   1

$\mu_{R_{c3}}(u_i, u_j)$:
      u1     u2     u3     u4     u5
u1    1      0.67   1      0.5    0.5
u2    0.67   1      0.67   0.67   0.67
u3    1      0.67   1      0.5    0.5
u4    0.5    0.67   0.5    1      1
u5    0.5    0.67   0.5    1      1

$\mu_{R_{c4}}(u_i, u_j)$:
      u1     u2     u3     u4     u5
u1    1      0.9    0.5    0.8    0.4
u2    0.9    1      0.6    0.7    0.5
u3    0.5    0.6    1      0.3    0.9
u4    0.8    0.7    0.3    1      0.2
u5    0.4    0.5    0.9    0.2    1
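For instance, the $\mu_{R_{c1}}$ block of Table 5 can be reproduced with the earlier sketches as follows (an illustration of ours):

names = sorted(table2)   # ['u1', 'u2', 'u3', 'u4', 'u5']
for i in names:
    print(i, [round(mu_b(table2[i]['c1'], table2[j]['c1']), 2) for j in names])
# the first printed row is [1.0, 0.67, 1.0, 0.86, 0.67]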

We calculate the degree of dependency of the decision attribute $d$ over the conditional attribute $c_1$ as follows, taking $\alpha = 0.70$.

Tolerance classes:
$[T_{c1}^{\alpha}](u_1) = \{u_1, u_3, u_4\}$, $[T_{c1}^{\alpha}](u_2) = \{u_2\}$, $[T_{c1}^{\alpha}](u_3) = \{u_1, u_3, u_4\}$, $[T_{c1}^{\alpha}](u_4) = \{u_1, u_3, u_4\}$, $[T_{c1}^{\alpha}](u_5) = \{u_5\}$.

$U/d = \{d_1, d_2\}$ with $d_1 = \{u_1, u_2, u_4\}$ and $d_2 = \{u_3, u_5\}$.

The lower approximations of $U/d$ are calculated as
$T_{c1}^{\alpha} \downarrow d_1 = \{u_2\}$, $T_{c1}^{\alpha} \downarrow d_2 = \{u_5\}$.

So, the positive region of $d$ over $c_1$ is calculated as
$POS_{c1}^{\alpha}(d) = \bigcup_{X \in U/d}(T_{c1}^{\alpha} \downarrow X) = (T_{c1}^{\alpha} \downarrow d_1) \cup (T_{c1}^{\alpha} \downarrow d_2) = \{u_2\} \cup \{u_5\} = \{u_2, u_5\}.$

Now, the degree of dependency of $d$ over $c_1$ is calculated as
$\gamma_{\{c_1\}}(d) = \frac{|POS_{c1}^{\alpha}(d)|}{|U|} = \frac{2}{5} = 0.4.$

Similarly, we can calculate the degree of dependency of the decision attribute with respect to the other conditional attributes:
$\gamma_{\{c_2\}}(d) = \frac{3}{5} = 0.6$, $\gamma_{\{c_3\}}(d) = \frac{1}{5} = 0.2$, $\gamma_{\{c_4\}}(d) = \frac{3}{5} = 0.6$.

Since $\gamma_{\{c_3\}}(d) < \gamma_{\{c_1\}}(d) < \gamma_{\{c_2\}}(d) = \gamma_{\{c_4\}}(d)$, either $c_2$ or $c_4$ will be a member of the reduct set.

Suppose $c_2$ is the first reduct member. We add the other attributes to $c_2$ one by one and calculate the corresponding degrees of dependency by using Eq. (8); we get
$[T_{\{c_1,c_2\}}^{\alpha}](u_1) = \{u_1\}$, $[T_{\{c_1,c_2\}}^{\alpha}](u_2) = \{u_2\}$, $[T_{\{c_1,c_2\}}^{\alpha}](u_4) = \{u_3, u_4\}$, $[T_{\{c_1,c_2\}}^{\alpha}](u_5) = \{u_5\}$,
$T_{\{c_1,c_2\}}^{\alpha} \downarrow d_1 = \{u_1, u_2\}$, $T_{\{c_1,c_2\}}^{\alpha} \downarrow d_2 = \{u_5\}$,
$POS_{\{c_1,c_2\}}^{\alpha}(d) = (T_{\{c_1,c_2\}}^{\alpha} \downarrow d_1) \cup (T_{\{c_1,c_2\}}^{\alpha} \downarrow d_2) = \{u_1, u_2, u_5\}$,
$\gamma_{\{c_1,c_2\}}(d) = \frac{3}{5} = 0.6.$
Similarly, $\gamma_{\{c_2,c_3\}}(d) = \frac{3}{5} = 0.6$ and $\gamma_{\{c_2,c_4\}}(d) = \frac{5}{5} = 1$.

Since the degree of dependency cannot exceed 1, $\{c_2, c_4\}$ will be the reduct set of the set-valued decision system given in Table 2.

Applying the method of Dai et al. (2013) on this example set-valued dataset, we get the same reduct set $\{c_2, c_4\}$; but when we change the value of the parameter $\alpha$ from 0.7 to 0.9, our approach gives a singleton reduct set. Therefore, the proposed approach gives the facility to obtain the best minimal reduct for a set-valued decision system.

Example 7.2 After converting the incomplete decision system given in Table 3 into a set-valued decision system by replacing missing attribute values with the set of all possible attribute values for any object, we get Table 4. Again, similar to Example 7.1, the fuzzy tolerance relation between objects $u_i, u_j \in U$ is given in Table 6.

Taking $\alpha = 0.4$, we have $d_1 = \{u_1, u_2, u_5, u_6\}$ and $d_2 = \{u_3, u_4\}$.

Calculating the degree of dependency of the decision attribute $D$ over the conditional attribute $c_1$:
$[T_{c1}^{\alpha}](u_1) = \{u_1, u_3, u_4, u_5\}$, $[T_{c1}^{\alpha}](u_2) = \{u_2, u_3, u_5, u_6\}$, $[T_{c1}^{\alpha}](u_3) = \{u_1, u_2, u_3, u_4, u_5, u_6\}$, $[T_{c1}^{\alpha}](u_4) = \{u_1, u_3, u_4, u_5\}$, $[T_{c1}^{\alpha}](u_5) = \{u_1, u_2, u_3, u_4, u_5, u_6\}$, $[T_{c1}^{\alpha}](u_6) = \{u_2, u_3, u_5, u_6\}$.

Now, $T_{c1}^{\alpha} \downarrow d_1 = \emptyset$ and $T_{c1}^{\alpha} \downarrow d_2 = \emptyset$, so $POS_{c1}^{\alpha}(d) = \emptyset$ and $\gamma_{\{c_1\}}(d) = 0$.

Similarly, for the other conditional attributes,
$\gamma_{\{c_2\}}(d) = 0$, $\gamma_{\{c_3\}}(d) = 0$, $\gamma_{\{c_4\}}(d) = 0.67$.

Since the degree of dependency is the highest for $c_4$, $c_4$ will be the first member of the reduct set. Similar to Example 7.1, on adding the other attributes to $c_4$, we can calculate the corresponding degrees of dependency as follows:
$\gamma_{\{c_1,c_4\}}(d) = 1$, $\gamma_{\{c_2,c_4\}}(d) = 1$, $\gamma_{\{c_3,c_4\}}(d) = 0.67$.

Table 6 Fuzzy tolerance relation

$\mu_{R_{c1}}(u_i, u_j)$:
      u1     u2     u3     u4     u5     u6
u1    1      0      0.67   1      0.67   0
u2    0      1      0.67   0      0.67   1
u3    0.67   0.67   1      0.67   1      0.67
u4    1      0      0.67   1      0.67   0
u5    0.67   0.67   1      0.67   1      0.67
u6    0      1      0.67   0      0.67   1

$\mu_{R_{c2}}(u_i, u_j)$:
      u1     u2     u3     u4     u5     u6
u1    1      0      0.5    0      0.5    1
u2    0      1      0.5    0      0.5    0
u3    0.5    0.5    1      0.5    1      0.5
u4    0      0      0.5    1      0.5    0
u5    0.5    0.5    1      0.5    1      0.5
u6    1      0      0.5    0      0.5    1

$\mu_{R_{c3}}(u_i, u_j)$:
      u1     u2     u3     u4     u5     u6
u1    1      0.4    0.4    0      0      0
u2    0.4    1      1      0.4    0.4    0.4
u3    0.4    1      1      0.4    0.4    0.4
u4    0      0.4    0.4    1      0      0
u5    0      0.4    0.4    0      1      0
u6    0      0.4    0.4    0      0      1

$\mu_{R_{c4}}(u_i, u_j)$:
      u1     u2     u3     u4     u5     u6
u1    1      0      0      0      0      1
u2    0      1      0      1      0      0
u3    0      0      1      0      0      0
u4    0      1      0      1      0      0
u5    0      0      0      0      1      0
u6    1      0      0      0      0      1

Hence, $\{c_1, c_4\}$ or $\{c_2, c_4\}$ will be the reduct set of the incomplete decision system given in Table 3.

Here, $\alpha$ is user-oriented, so we can find the best minimal reduct by changing the value of $\alpha$, as shown in Table 7.

Table 7 Effect of $\alpha$ on the reduct set

Value of $\alpha$     Reducts
$\alpha = 0.4$        $\{c_1, c_4\}$ or $\{c_2, c_4\}$
$\alpha = 0.5$        $\{c_3, c_4\}$
$\alpha = 0.6$        $\{c_2, c_4\}$ or $\{c_3, c_4\}$
$\alpha = 0.7$        $\{c_2, c_3\}$ or $\{c_2, c_4\}$

So, an expert can decide the value of $\alpha$ according to the domain in order to find the most suitable reduct set of a decision system with missing values.
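One way to explore this effect of the threshold programmatically, in the spirit of Table 7, is to sweep $\alpha$ and rerun the greedy procedure; the snippet below is our illustration, applied to the Table 2 data with the helpers sketched earlier:

for alpha in (0.4, 0.5, 0.6, 0.7):
    print(alpha, quick_reduct(table2, ['c1', 'c2', 'c3', 'c4'], alpha))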

Example 7.3 Let us consider a practical situation from the foreign language ability test in Shanxi University, China. The results can be viewed as a conjunctive set-valued information system. We have classified the whole test into four factors: Audition, Spoken language, Reading and Writing. The test results are given in Table 8, which can be downloaded from http://www.yuhuaqian.com, where $U = \{u_1, u_2, u_3, \ldots, u_{49}, u_{50}\}$. For convenience, we use the abbreviations A, S, R and W for Audition, Spoken language, Reading and Writing, respectively.

We apply the same process as described in Example 7.1 and calculate the reduct for this dataset by taking $\alpha = 0.67$: the degrees of dependency of the decision attribute over the single conditional attributes (for instance, $\gamma_{\{c_1\}}(D) = 0$) and then over their combinations are computed as before. So, the reduct of the decision system is either $\{c_2, c_3, c_4\}$ or $\{c_1, c_2, c_4\}$.

So far, we have performed the experimental analysis for attribute selection of a set-valued information system by applying the proposed rough set-based approach in Example 7.1. We have compared the proposed approach with an existing approach to find the close-to-minimal reduct set by changing the value of the parameter $\alpha$. In Example 7.2, we have dealt with the problem of attribute selection in an incomplete information system through conversion into a set-valued information system by replacing the missing attribute values of an object with the set of all possible attribute values. The effect of the parameter $\alpha$ on finding a minimal reduct set, which depends on the user's choice, has also been shown. In Example 7.3, we have successfully applied our approach in a practical situation obtained from the foreign language ability test in Shanxi University, China.

Experimental results and analysis

To check the efficiency of the proposed attribute reduction algorithm for set-valued information systems, we perform some experiments on a PC with the specifications given in Table 9. We conduct our experiments on six real datasets taken from the University of California, Irvine (UCI) Machine Learning Repository (Blake 1998). All six real datasets are incomplete decision systems (a special case of set-valued decision systems), given in Table 10. For the experimental work, we use the WEKA tool with tenfold cross-validation (Hall et al. 2009). In the experiments, we select three attribute reduction algorithms for comparison. There are two statistical approaches, Relief-F (Robnik-Šikonja and Kononenko 2003) and correlation-based feature selection (Hall 1999), and one fuzzy rough set model (Dai and Tian 2013). For convenience, we denote them as Relief-F, CFS and FRSM, respectively. For the calculation of classification accuracies, we use two classifiers, namely PART and J48. Finally, a paired t test is performed to ensure the significance of the experimental

Table 8 A set-valued decision table obtained from foreign language ability test in Shanxi University
8山西大学外语能力测验得到的定值决策表

Table 9 The description of the experiment environment

No.   Name        Model                        Parameters
1     CPU         Intel(R) Core(TM) i5-4210U   1.70 GHz, 2 cores
2     Memory      DDR3 SDRAM                   8 GB, 2401 MHz
3     Hard disk   ST1000LM024                  1 TB
4     System      Windows 10                   64-bit
5     Platform    Python 2.7                   Anaconda distribution for Windows

Table 10 The description of datasets

No.   Dataset                   Abbreviation   Objects   Features   Classes
1     Audiology_Standardized    Audiology      226       69         24
2     Soyabean_Large            Soyabean       307       35         19
3     Dermatology               Dermatology    366       34         6
4     Hepatitis                 Hepatitis      115       19         2
5     Zoo                       Zoo            101       17         7
6     Processed_Cleveland       Cleveland      303       14         5

results obtained from the proposed approach, where the significance level is specified to be 0.05.
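For reference, a generic paired t test of this kind can be run with SciPy; the sketch below is our illustration and does not reproduce the paper's exact pairing of samples (e.g. over cross-validation folds). The two lists take the FSRS ($\alpha$ = 0.3) and FRSM columns of Table 12 as example inputs.

from scipy import stats

fsrs = [78.31, 0.0, 91.26, 83.22, 95.04, 54.12]    # FSRS (alpha = 0.3), Table 12
frsm = [76.12, 87.70, 90.25, 76.13, 85.14, 49.17]  # FRSM, Table 12
t_stat, p_value = stats.ttest_rel(fsrs, frsm)
print(p_value < 0.05)   # True would indicate significance at the 0.05 level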

Size and the classification accuracy of the reduced feature subset obtained by FSRS are statistically compared with those acquired by FRSM, Relief-F and CFS by using the paired t test. The obtained results are listed in the last columns of Tables 11, 12 and 13. In the tables, "w" denotes the number of wins, "*" denotes the number of ties and "l" denotes the number of losses achieved by the proposed FSRS approach, which is also written next to the values in Tables 11, 12 and 13. Here, a win means that the cardinality (or accuracy) of the feature subset obtained by FSRS is

significantly fewer (or higher) than that of FRSM, Relief-F or CFS; a tie means that the results obtained by FSRS have no statistical difference from those of FRSM, Relief-F or CFS; and a loss means that the proposed approach is statistically poorer than the other approaches.

Reduct size

After comparing the proposed approach with the other three approaches on the chosen datasets, the reduced average (avg.) feature subset sizes are given in Table 11. The effect of parameter $\alpha$ on the reduct size is also shown for the proposed approach.

Table 11 Comparison of feature subset size (Avg. subset size)

Table 12 Comparison of classification accuracies (rules-PART)

Datasets      Original   FSRS                                 FRSM      Relief-F   CFS       Paired t test (w/*/l)
                         a=0.1    a=0.3    a=0.5    a=0.7
Audiology     78.31      77.87    78.31    75.66    75.66     76.12*    80.08*     77.43*    (0/3/0)
Soyabean      91.94      0        0        83.74    88.43     87.70*    87.99 l    85.06*    (0/2/1)
Dermatology   94.53      91.26    91.26    91.26    91.26     90.25 w   91.53 w    90.71 w   (3/0/0)
Hepatitis     67.74      83.22    83.22    83.22    83.22     76.13 w   84.51*     81.93*    (1/2/0)
Zoo           92.07      95.04    95.04    95.04    95.04     85.14 w   95.04*     95.04*    (1/2/0)
Cleveland     53.79      54.12    54.12    54.12    54.12     49.17 w   57.75 l    54.78*    (1/1/1)

Table 13 Comparison of classification accuracies (trees-J48)

Datasets      Original   FSRS                                 FRSM      Relief-F   CFS       Paired t test (w/*/l)
                         a=0.1    a=0.3    a=0.5    a=0.7
Audiology     77.87      78.31    78.76    77.87    77.87     76.12*    78.76*     77.87*    (0/3/0)
Soyabean      91.50      0        0        86.23    87.99     87.99*    87.84 l    85.36*    (0/2/1)
Dermatology   93.98      92.62    92.62    92.62    92.62     88.28*    90.44*     87.70*    (0/3/0)
Hepatitis     58.06      83.22    83.22    83.22    83.22     78.06 w   84.51*     81.29*    (1/2/0)
Zoo           92.07      94.05    94.05    94.05    94.05     89.10 w   95.04*     95.04*    (1/2/0)
Cleveland     52.14      54.12    54.12    54.12    54.12     54.12*    56.10*     54.12*    (0/3/0)

Results obtained from Table 11 indicate that all four feature selection algorithms exclude most of the features available in the unreduced datasets. However, it can be observed that the proposed approach provides a smaller or equal reduct size as compared to the other three approaches. For the hepatitis dataset, which has 19 attributes, the proposed approach (FSRS) selects 3 attributes (the nearest integer is taken), while FRSM, Relief-F and CFS select 6, 5 and 10 attributes, respectively. This shows that FSRS has a redundancy-removing capacity, whereas the other algorithms do not completely eradicate redundant features from the selected feature subset.

Effect of parameter a The parameter a needs to be set individually for each dataset because of their different correlation strengths. In Table 11, the selected feature subset for the soyabean dataset varies with the change in parameter a. At a = 0.1 and a = 0.3, FSRS does not provide

any reduct elements, but as we increase the threshold parameter a, FSRS outperforms the other approaches. For example, at a = 0.5 and a = 0.7, FSRS selects 9 and 12 attributes, respectively, while FRSM, Relief-F and CFS select 16, 17 and 17 attributes, respectively. Also, for the audiology data, the subset size is affected by the parameter value a: at a = 0.1, a = 0.3 and a = 0.5, FSRS gives reduct sizes 12, 12 and 10, respectively, but at a = 0.7, the reduct size is the same as at a = 0.5. The remaining four datasets show no variation with the value of parameter a, as they provide the same number of selected attributes for all chosen values of a.
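To make the role of the threshold concrete, the sketch below computes a greedy, degree-of-dependency-based reduct for a toy set-valued table and sweeps the parameter a. It is a minimal sketch, not the authors' exact formulation: the Jaccard-style set similarity, the crisp thresholding into tolerance classes (in place of the paper's fuzzy tolerance relation) and the toy data are assumptions for illustration only.

def similarity(v1, v2):
    """Assumed set similarity between two set-valued entries (Jaccard index)."""
    v1, v2 = set(v1), set(v2)
    return len(v1 & v2) / len(v1 | v2) if v1 | v2 else 1.0

def tolerance_classes(objects, attrs, alpha):
    """Objects i and j are tolerant if every attribute similarity is >= alpha."""
    n = len(objects)
    return [{j for j in range(n)
             if all(similarity(objects[i][a], objects[j][a]) >= alpha for a in attrs)}
            for i in range(n)]

def dependency(objects, attrs, decision, alpha):
    """Degree of dependency: fraction of objects whose tolerance class
    lies entirely inside one decision class (the positive region)."""
    if not attrs:
        return 0.0
    classes = tolerance_classes(objects, attrs, alpha)
    pos = sum(1 for cls in classes if len({decision[j] for j in cls}) == 1)
    return pos / len(objects)

def greedy_reduct(objects, all_attrs, decision, alpha):
    """Forward greedy search: repeatedly add the attribute that gives the
    largest gain in dependency until the full-attribute dependency is reached."""
    target = dependency(objects, all_attrs, decision, alpha)
    reduct, gamma = [], 0.0
    while gamma < target:
        best_attr, best_gamma = None, gamma
        for a in all_attrs:
            if a in reduct:
                continue
            g = dependency(objects, reduct + [a], decision, alpha)
            if g > best_gamma:
                best_attr, best_gamma = a, g
        if best_attr is None:          # no attribute improves the dependency further
            break
        reduct.append(best_attr)
        gamma = best_gamma
    return reduct

# Toy set-valued decision table (hypothetical) and a sweep over the threshold a
objects = [{'c1': {1, 2}, 'c2': {0}},
           {'c1': {2},    'c2': {0, 1}},
           {'c1': {3},    'c2': {1}}]
decision = [0, 0, 1]
for alpha in (0.1, 0.3, 0.5, 0.7):
    print(alpha, greedy_reduct(objects, ['c1', 'c2'], decision, alpha))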

Statistical analysis It can be seen clearly from the results of the t test, which is applied between the reduct sizes of the FSRS (at a = 0.5), FRSM, Relief-F and CFS algorithms (presented in Table 11), that, for almost all the datasets, FSRS outperforms the other three reduction algorithms in terms of

Fig. 1 Variation of reduced feature subset sizes with four algorithms

Fig. 2 Variation of classification accuracies with classifier PART for four algorithms

the cardinality of the feature subset. In Table 11, FSRS achieves significantly fewer features for all the datasets except the Cleveland and zoo datasets. For the Cleveland dataset, FSRS is statistically equivalent to the Relief-F and CFS algorithms in terms of subset size. In summary, out of a total of 18 paired t test performance results, it obtains 15 wins, 3 ties and 0 losses.

Classification accuracy

A comparison of classification accuracies for the classifiers PART and J48 is presented in Tables 12 and 13, respectively. The classification accuracies are given in percentage. In Table 12, for the soyabean, dermatology and zoo datasets, the classification accuracies obtained by the FSRS algorithm are higher than or equal to those of the other three algorithms, while for the other datasets it shows mixed behaviour. For example, the classification accuracy for the Cleveland

dataset is 54.12 for the FSRS approach and 49.17, 57.75 and 54.78 for the FRSM, Relief-F and CFS approaches, respectively. Similarly, for the hepatitis dataset, the proposed algorithm provides better classification accuracy than FRSM and CFS, but less than the Relief-F algorithm. In Table 13, we find similar results to those in Table 12.
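The accuracies above are obtained with WEKA classifiers. As a rough Python sketch of the same evaluation protocol, the snippet below computes the cross-validated accuracy of a feature subset; scikit-learn's DecisionTreeClassifier is used here only as a stand-in for J48, and the synthetic data stand in for the benchmark datasets, so none of this is the authors' original setup.

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

def subset_accuracy(X, y, selected_columns, folds=10):
    """Mean cross-validated accuracy (in %) on the selected feature subset."""
    clf = DecisionTreeClassifier(random_state=0)
    scores = cross_val_score(clf, X[:, selected_columns], y, cv=folds)
    return 100.0 * scores.mean()

# Synthetic stand-in data; with a real dataset the column indices would be
# the attributes returned by the FSRS reduct.
rng = np.random.RandomState(0)
X = rng.randint(0, 3, size=(120, 10))
y = (X[:, 0] + X[:, 3] > 2).astype(int)
print(subset_accuracy(X, y, list(range(10))))   # all attributes
print(subset_accuracy(X, y, [0, 3]))            # a reduced subset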

Effect of parameter a As we change the value of a, the corresponding classification accuracies also change for the soyabean and audiology datasets, but not for the other datasets. In Table 12, for a = 0.1 and a = 0.3, the classification accuracies for the soyabean dataset are 0, but for a = 0.5 and a = 0.7, the classification accuracies are 83.74 and 88.43, respectively. So, by changing the value of parameter a, we can obtain better classification accuracies than the other three approaches. The parameter has no effect on the remaining four datasets.

Statistical analysis A paired t test is applied between the classification accuracies of the FSRS (at a = 0.5), FRSM, Relief-F and CFS approaches, and the results are shown in the last column

Fig. 3 Variation of classification accuracies with classifier J48 for four algorithms

of Tables 12 and 13. In Table 12, FSRS achieves significantly higher or equivalent accuracy for all datasets except the soyabean and Cleveland datasets, where it loses to the Relief-F approach. In summary, out of a total of 18 paired t test performance results, the FSRS approach obtains 6 wins, 10 ties and 2 losses. Similarly, in Table 13, out of a total of 18 paired t test performance results, it obtains 2 wins, 15 ties and 1 loss. Therefore, the proposed FSRS approach is effective and clearly better than the other approaches in terms of both selecting fewer features and achieving high classification accuracy.

More detailed change trends of each approach on the six datasets are displayed in Figs. 1, 2 and 3. Figure 1 presents a comparison of the reduced average feature subset sizes for all four algorithms on the six datasets. It can be observed that the proposed FSRS algorithm selects the least number of features as members of the reduct set. Figures 2 and 3 display the more detailed change trend of the algorithms in classification accuracy with the number of selected attributes on all chosen datasets. It is obvious from the figures that the proposed approach provides either higher or nearly equal classification accuracy for all six datasets.

After summarizing the comparison tables and graphs above, we can finally conclude that the proposed FSRS algorithm is an acceptable choice to select the best feature subsets in set-valued decision systems.

Conclusion and future work

In this paper, we have defined a tolerance relation for set-valued decision systems and given a novel approach for attribute selection based on the rough set concept using a similarity threshold. Lower and upper approximations have been defined by using the fuzzy tolerance relation, and a method has been presented to calculate the degree of dependency of the decision attribute over a subset of conditional attributes. Some important results on lower and upper approximations, positive regions and degrees of dependency have been validated using our approach. Moreover, we have presented an algorithm along with some illustrative examples for a better understanding of the proposed approach. In Example 7.2, we have applied our method to an incomplete information system in which some attribute values were missing. The effect of parameter a on the reduct set of set-valued decision systems has been shown. We have compared the proposed approach with three existing approaches on six real benchmark datasets and observed that our model is able to find the minimal reduct with higher accuracy. We have also ensured, by using the paired t test technique, that the proposed approach is statistically more significant in comparison with the other approaches. In the future, we will investigate some robust models for set-valued information systems to avoid misclassification and noise. Set-valued information systems with missing decision values will be taken into consideration from the viewpoint of updating the process of knowledge discovery. We also intend to find some generalizations of fuzzy rough set-based attribute selection for set-valued decision systems.

Compliance with ethical standards

Conflict of interest The authors declare that they have no conflict of interest.

Research involving human participants and/or animals This article does not contain any studies with human participants or animals performed by any of the authors.

References

Blake CL (1998) UCI Repository of machine learning databases, Irvine, University of California. http://www.ics.uci.edu/
BlakeCL(1998)UCI机器学习数据库存储库,尔湾,加利福尼亚大学http://www.ics.uci.edu/

*mlearn/MLRepository.html. Accessed 1 Feb 2019
*mlearn/MLRepository.htm访问日期2019 年 2 月1

Dai J (2013) Rough set approach to incomplete numerical data. Inf
DaiJ(2013)不完全数值数据的粗糙方法.Inf

Sci 241:43–57
科学241:43-57

Dai J, Tian H (2013) Fuzzy rough set model for set-valued data.
DaiJ,Tian H(2013)用于集值数据的模糊粗糙模型.

Fuzzy Sets Syst 229:54–68
模糊系统229:54-68

Dai J, Xu Q (2012) Approximations and uncertainty measures in incomplete information systems. Inf Sci 198:62–80
Dai J, Xu Q (2012) 不完整信息系统中的近似值和不确定性测量 。国际科学 198:62-80

Dai J, Wang W, Tian H, Liu L (2013) Attribute selection based on a new conditional entropy for incomplete decision systems. Knowl-Based Syst 39:207–213
Dai J, Wang W, Tian H, Liu L (2013) 基于不完整决策系统的新条件熵的属性选择基于知识的系统 39:207-213

Dubois D, Prade H (1992) Putting rough sets and fuzzy sets together. In: Słowin´ski R (ed) Intelligent decision support. Springer, Dordrecht, pp 203–232
Dubois D, Prade H (1992) 将粗糙集和模糊集放在一起。 In: Słowinski R (ed) Intelligent decision support. 施普林格,多德雷赫特,第 203-232 页

Guan YY, Wang HK (2006) Set-valued information systems. Inf Sci 176(17):2507–2525
Guan YY, Wang HK (2006) 集值信息系统。国际科学 176(17):2507–2525

Hall M (1999) Correlation-based feature selection for machine learning. PhD Thesis, Department of Computer Science, Waikato University, New Zealand
Hall M (1999) 用于机器学习的基于相关性的特征选择 博士论文,新西兰怀卡托大学计算机科学系

Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. ACM SIGKDD Explor Newsl 11(1):10–18
Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) WEKA 数据挖掘软件:更新。ACM SIGKDD 探索新闻 11(1):10-18

He Y, Naughton JF (2009) Anonymization of set-valued data via top- down, local generalization. Proc VLDB Endow 2(1):934–945 

Hu Q, Yu D, Liu J, Wu C (2008) Neighborhood rough set based heterogeneous feature subset selection. Inf Sci 178(18):3577–3594 

Huang SY (ed) (1992) Intelligent decision support: handbook of applications and advances of the rough sets theory, vol 11. Springer, Berlin 

Jensen R, Shen Q (2009) New approaches to fuzzy-rough feature selection. IEEE Trans Fuzzy Syst 17(4):824–838 

Jensen R, Cornelis C, Shen Q. (2009) Hybrid fuzzy-rough rule induction and feature selection. In: FUZZ-IEEE 2009, IEEE international conference on fuzzy systems, 2009. IEEE, 

pp. 1151–1156 

Kryszkiewicz M (1998) Rough set approach to incomplete informa- tion systems. Inf Sci 112(1–4):39–49 

Kryszkiewicz M (1999) Rules in incomplete information systems. Inf Sci 113(3–4):271–292 

Lang G, Li Q, Yang T (2014) An incremental approach to attribute reduction of dynamic set-valued information systems. Int J Mach Learn Cybern 5(5):775–788
Lang G, Li Q, Yang T (2014) 动态集值信息系统属性缩减的增量方法。国际赫学习网络杂志 5(5):775–788

Leung Y, Li D (2003) Maximal consistent block technique for rule acquisition in incomplete information systems. Inf Sci 153:85–106
Leung Y, Li D (2003) 用于不完整信息系统中规则获取的最大一致块技术。国际科学 153:85-106

Lipski W Jr (1979) On semantic issues connected with incomplete information databases. ACM Trans Database Syst (TODS) 4(3):262–296
Lipski W Jr (1979) 关于与不完整信息数据库相关的语义问题。ACM Trans 数据库系统 (TODS) 4(3):262–296

Lipski W Jr (1981) On databases with incomplete information.
LipskiWJr(1981)关于信息不完整数据库

J ACM (JACM) 28(1):41–70
美国医学杂志(JACM)28(1):41-70

Luo C, Li T, Chen H, Liu D (2013) Incremental approaches for updating approximations in set-valued ordered information systems. Knowl-Based Syst 50:218–233
Luo C, Li T, Chen H, Liu D (2013) 更新集合值有序信息系统中近似值的增量方法。基于知识的系统 50:218-233

Luo C, Li T, Chen H (2014) Dynamic maintenance of approximations in set-valued ordered decision systems under the attribute generalization. Inf Sci 257:210–228
Luo C,Li T,Chen H(2014) 属性泛化下集合值有序决策系统中近似的动态维护.国际科学 257:210-228

Luo C, Li T, Chen H, Lu L (2015) Fast algorithms for computing rough approximations in set-valued decision systems while updating criteria values. Inf Sci 299:221–242
Luo C, Li T, Chen H, Lu L (2015) 在更新标准值时计算设定值决策系统中粗略近似的快速算法。国际科学 299:221–242

Orłowska E (1985) Logic of nondeterministic information. Stud Logica 44(1):91–100
Orłowska E (1985) 非确定性信息的逻辑. 螺柱 44(1):91-100

Orłowska E, Pawlak Z (1984) Representation of nondeterministic information. Theor Comput Sci 29(1–2):27–39
Orłowska E, Pawlak Z (1984) 非确定性信息的表示。理论计算科学 29(1-2):27-39

Pawlak Z (1991) Rough Sets: theoretical aspects of reasoning about data. Kluwer Academic Publishers, Dordrecht
Pawlak Z (1991) 粗糙集:关于数据推理的理论方面。Kluwer Academic Publishers, 多德雷赫特

Pawlak Z, Skowron A (2007a) Rough sets and Boolean reasoning. Inf Sci 177(1):41–73
Pawlak Z,Skowron A(2007a)粗糙集和布尔推理。国际科学177(1):41-73

Pawlak Z, Skowron A (2007b) Rough sets: some extensions. Inf Sci 177(1):28–40
Pawlak Z, Skowron A (2007b) 粗糙的集:一些扩展。国际科学 177(1):28-40

Pawlak Z, Skowron A (2007c) Rudiments of rough sets. Inf Sci 177(1):3–27
Pawlak Z, Skowron A (2007c) 粗糙集的雏形。国际科学177(1):3-27

Qian Y, Dang C, Liang J, Tang D (2009) Set-valued ordered information systems. Inf Sci 179(16):2809–2832
Qian Y, Dang C, Liang J, Tang D (2009) 集值有序信息系统。国际科学 179(16):2809–2832

Qian Y, Liang J, Pedrycz W, Dang C (2010a) Positive approximation: an accelerator for attribute reduction in rough set theory. Artif Intell 174(9–10):597–618
QianY,Liang J,PedryczW,Dang C(2010a)近似:粗糙集理论中属性约简的加速器。 Artif Intell 174(9-10):597-618

Qian YH, Liang JY, Song P, Dang CY (2010b) On dominance relations in disjunctive set-valued ordered information systems. Int J Inf Technol Decis Mak 9(01):9–33
Qian YH, Liang JY, Song P, Dang CY (2010b) 关于分离集值有序信息系统中的支配关系。国际技术决策杂志 9(01):9–33

Qian J, Miao DQ, Zhang ZH, Li W (2011) Hybrid approaches to attribute reduction based on indiscernibility and discernibility relation. Int J Approx Reason 52(2):212–230
Qian J, Miao DQ, Zhang ZH, Li W (2011) 基于不可辨别性和可辨别性关系的属性简化的混合方法。国际 J 近似原因 52(2):212–230

Robnik-Sˇikonja M, Kononenko I (2003) Theoretical and empirical
Robnik-Sˇikonja M,Kononenko(2003)理论实证

analysis of ReliefF and RReliefF. Mach Learn 53(1–2):23–69 Shi Y, Yao L, Xu J (2011) A probability maximization model based
ReliefF 和 RReliefF 的分析。马赫学习 53(1–2):23–69 ShiY,Yao L,Xu J(2011)基于概率最大化模型

on rough approximation and its application to the inventory problem. Int J Approx Reason 52(2):261–280
关于粗略近似及其在库存问题中的应用。国际 J 近似原因 52(2):261–280

Shoemaker CA, Ruiz C (2003) Association rule mining algorithms for set-valued data. In: International conference on intelligent data engineering and automated learning, Springer, Berlin,
Shoemaker CA,Ruiz C(2003)集合值数据的关联规则挖掘算法。在: 智能数据工程自动化学习国际会议,Springer,柏林,

pp. 669–676
669–676

Shu W, Qian W (2014) Mutual information-based feature selection from set-valued data. In: 26th IEEE international conference on tools with artificial intelligence (ICTAI), 2014, IEEE,
Shu W, Qian W (2014) 从集合值数据中选择基于互信息的特征。收录于: 第 26 届 IEEE 人工智能工具国际会议 (ICTAI),2014,IEEE

pp. 733–739
733–739

Wang H, Yue HB, Chen XE (2013) Attribute reduction in interval and set-valued decision information systems. Appl. Math. 4(11):1512
Wang H,Yue HB,Chen XE(2013)区间和设定值决策信息系统的属性减少应用数学4(11):1512

Data sets in articles. http://www.yuhuaqian.com
文章中的数据集http://www.yuhuaqian.com

Yang T, Li Q (2010) Reduction about approximation spaces of covering generalized rough sets. Int J Approx Reason 51(3):335–345
Yang T, Li Q (2010) 关于覆盖广义粗糙集的近似空间的减少。国际 J 近似原因 51(3):335–345

Yang QS, Wang GY, Zhang QH, MA XA (2010) Disjunctive set- valued ordered information systems based on variable precision dominance relation. J. Guangxi Normal Univ Nat Sci Ed 3:84–88 Yang X, Zhang M, Dou H, Yang J (2011) Neighborhood systems- based rough sets in incomplete information system. Knowl
YangQS,Wang GY,Zhang QH,MA XA(2010)基于变精度优势关系的析取集值有序信息系统。J. 广西师范大学自然科学版 3:84–88 YangX,Zhang M,Dou H,Yang J(2011)基于邻域系统的粗糙在不完整的信息系统中。诺尔

Based Syst 24(6):858–867
基础系统24(6):858–867

Yang X, Song X, Chen Z, Yang J (2012) On multigranulation rough sets in incomplete information system. Int J Mach Learn Cybern 3(3):223–232
Yang X, Song X, Chen Z, Yang J (2012) 关于不完整信息系统中的多颗粒粗糙集。国际马赫学习网络杂志 3(3):223–232

Yao YY (2001) Information granulation and rough set approximation.
YaoYY(2001)信息粒度粗糙近似.

Int J Intell Syst 16(1):87–104
国际知识系统杂志16(1):87–104

Yao YY, Liu Q (1999) A generalized decision logic in interval-set- valued information tables. In: International workshop on rough sets, fuzzy sets, data mining, and granular-soft computing, Springer, Berlin, pp. 285–293
Yao YY, Liu Q (1999) 区间集值信息表中的广义决策逻辑。收录于:粗糙集、模糊集、数据挖掘和粒度软计算国际研讨会,Springer,柏林,第 285-293 页

Zadeh LA (1996) Fuzzy sets. In: Fuzzy sets, fuzzy logic, and fuzzy systems: selected papers by Lotfi A Zadeh, pp. 394–432
Zadeh LA (1996) 模糊集。收录于:模糊集、模糊逻辑和模糊系统:Lotfi A Zadeh 论文选集,第 394-432 页

Zhang J, Li T, Ruan D, Liu D (2012) Rough sets based matrix approaches with dynamic attribute variation in set-valued information systems. Int J Approx Reason 53(4):620–635
Zhang J, Li T, Ruan D, Liu D (2012) 在集合值信息系统中具有动态属性变化的基于粗糙集的矩阵方法。国际 J 大约原因 53(4):620–635

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.