Customer hierarchical RFM model based on partially ordered sets
Yan Yan1, Yue Lizhu1
1. School of Economics and Management, Huangshan University, Huangshan 245041, Anhui Province
Abstract: Theoretical studies have shown that extending the indicators of the traditional RFM model can improve customer classification. However, the literature conflicts on indicator weighting: opposite weights have been obtained for the same scenario. In practice, many models cannot be applied because specifying explicit weight parameters is costly. This paper therefore proposes replacing exact weights with a weight space. The weight space is formed from the personal preference of the evaluator, the RFM model with the weight space is expressed with a partially ordered set (poset), and a Hasse diagram that reflects customer classification and ranking is obtained. The results show that the RFM model and its extensions can be run with only the order of the weights. Customer classification based on the Hasse diagram not only shows the structural information between customers, but is also robust and can flexibly reflect the strategic goals of organizational development.
Keywords: RFM model; partially ordered set; weight; customer segmentation
1 Introduction
Since Hughes (1994) proposed the RFM model [1], it has become one of the most popular models in the field of customer classification. Three indicators are extracted from the buyer's behavior records: recency (R), purchase frequency (F), and monetary amount (M) [2]. Customers are then divided into hierarchical categories so that strategies for acquiring and retaining them can be developed. Owing to its simplicity and ease of communication, the RFM model has become the second most frequently used method in customer analysis, after cross-tabulation [3], and one of the core tools of customer management.
Early research showed that recency differentiates customer response rates to promotions better than purchase frequency does [4]. Among the three purchase behavior variables, the amount has a higher influence in short-term profitability prediction [5]. Many studies have shown that recency, purchase frequency, and amount are not equally important, and that it is more realistic to weight them.
Adding weights can in theory improve the classification ability of the model, but in practice it brings confusion. For the weights of recency, purchase frequency, and amount, some scholars obtained 0.731, 0.188, and 0.081 [6], whereas others obtained nearly opposite results, namely 0.058, 0.546, and 0.395 [7]. What is puzzling is that, for similar scenarios and with the same weighting method, the resulting weights differ so much that they cannot be reconciled.
How should such "changeable" weights be handled in practice? First, it should be recognized that there is no universal weighting: weights depend not only on the context but also on the preferences of decision-makers. To handle this variability, multiple weights are used to characterize it instead of a single weight. When a weight space (uncountably many weights) is generated from the decision-maker's preference, computing the customer RFM value becomes difficult. Partially ordered set theory is therefore applied to resolve the difficulty of calculation and comparison, and a Hasse diagram that supports customer comparison and rank classification is obtained. It is worth noting that the poset method has some similarities with Ref. [8], which resembles formal concept analysis (FCA); this paper, however, does not use the extra-frontal method, and is more general and flexible.
2 Literature review
The RFM model compares and classifies customers by their purchasing behavior, based on the following assumptions [9]: 1) customers who have purchased recently are more likely to buy again than those who have not; 2) customers who buy more frequently are more likely to buy the company's products again than those who buy less frequently; 3) customers with a higher total purchase amount are more likely to buy again and are customers of higher value. The three assumptions correspond to three variables, namely recency, purchase frequency, and purchase amount, and the scores of the three variables are then combined into the RFM score. The combined score is also sometimes expressed by placing the three scores side by side, giving a range from 111 to 555 (Haiying and Yu, 2010 [10]); this is the most popular form and does not take weights into account.
From the perspective of model extension, research on RFM models falls into two groups: research on adding indicators to the model, and research on weighting the model indicators. The combined application of RFM and clustering models is also a popular research topic, but because it is only loosely related to the subject of this paper, the discussion below focuses on indicator extension and indicator weighting.
2.1 Research on the addition of indicators to the model
Previous studies have shown that adding variables to the RFM model can improve the prediction of customer behavior [11]. Reinartz et al. [12] argue that adding the length variable (L), the time difference between the customer's first and last transactions, resolves the RFM model's inability to distinguish short-term from long-term customers. Adding the diversity variable (D) [13] reflects the variety of products purchased by a customer. In the RFM model, the recency indicator is rather random: a new customer and an old customer may have the same last purchase and therefore similar recent consumption records. A customer contribution time dimension is therefore added to distinguish between new and old customers [14].
Zhou et al. [15] introduced the time between purchases (T) in the RFMT model. On the basis of this model, Mensouri et al. [16] added a customer satisfaction (S) dimension to obtain the RFMTS model, which gives a deeper understanding of customer satisfaction. Because the RFM model is not suitable for industries with distinct social group attributes, the social relationship parameter C was introduced to establish the RFMC model [17]. In addition, Handojo et al. [21] proposed an RFM method that can be computed at different time granularities; since it analyzes RFM values over these different time periods, the model can be regarded as adding a temporal dimension.
2.2 Research on index weighting of models
The RFM model and its extensions need to be used together with a weighting method, most commonly the analytic hierarchy process (AHP) [18][19][20], but also the entropy weight method [21] and factor analysis [22]. On the whole, AHP is the mainstream choice, while the entropy weight method and factor analysis are gradually falling out of use because their results often change markedly when the data change. In some studies, the analytic network process (ANP) was combined with RFM [23]. If differences in the number of indicators are not taken into account, a typical form of a weighted RFM model is as follows:
$$RFM = w_R R + w_F F + w_M M \tag{1}$$
In the above equation, $w_R$, $w_F$ and $w_M$ represent the weights of R, F, and M, respectively. Different weights mean that the indicators affect customer value to different degrees: the higher the weight, the greater the impact on the customer's value.
In the traditional RFM model, Liu and Shih [6] consider recency the most important indicator, followed by purchase frequency, with the payment amount the least important; the three weights were 0.731, 0.188 and 0.081, respectively. Using a similar approach, Monalisa et al. [7] obtained weights of 0.058, 0.546, and 0.395, respectively. Based on the opinion of sales veterans, Khajvand et al. [11] determined that the frequency variable has the highest weight, 0.637, followed by a monetary weight of 0.258 and a recency weight of 0.105. The assignment of RFM weights generally varies greatly, and the following table summarizes representative research results.
Table 1 Weights of RFM indicators in different studies
Study | R | F | M | Method |
Liu, D.R., Shih [6] | 0.731 | 0.188 | 0.081 | AHP |
Monalisa, S., Nadya, P., Novita [7] | 0.058 | 0.546 | 0.395 | AHP |
Khajvand et al. [11] | 0.105 | 0.637 | 0.258 | AHP |
 | 0.17 | 0.47 | 0.35 | Fuzzy AHP |
Chen, K.Y., Wang, C.H. [25] | 0.3 | 0.6 | 0.1 | AHP |
 | 0.059 | 0.463 | 0.477 | AHP |
The weights differ markedly not only for the traditional RFM model but also for its extensions. Hossaidi et al. [27] applied the AHP method to the RFML model and obtained weights of 0.25, 0.15, 0.5 and 0.1, respectively. Ibrahim et al. [28] also applied the AHP method and obtained indicator weights of 0.134, 0.520, 0.222 and 0.124, respectively. The two studies address similar scenarios and use the same weighting method, yet their weights differ significantly. This indicates that the weighting results of the model are closely tied to decision-making preference, so the personal preference of the decision-maker should be considered.
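To make the practical consequence of this conflict concrete, the sketch below scores two customers with the weighted sum of Eq. (1) under the two published weight vectors of Refs. [6] and [7]. The two score vectors and the names `rfm_score` and `ranking` are illustrative assumptions, not data or code from the cited studies.

```python
# Weights (w_R, w_F, w_M) reported in the literature (Table 1).
WEIGHTS = {
    "Liu and Shih [6]": (0.731, 0.188, 0.081),
    "Monalisa et al. [7]": (0.058, 0.546, 0.395),
}

# Hypothetical 5-point (R, F, M) scores for two customers (illustrative only).
CUSTOMERS = {"a": (3, 4, 2), "b": (5, 3, 1)}


def rfm_score(scores, weights):
    """Weighted RFM value: w_R*R + w_F*F + w_M*M."""
    return sum(w * s for w, s in zip(weights, scores))


def ranking(weights):
    """Customer names sorted from highest to lowest weighted RFM value."""
    return sorted(CUSTOMERS, key=lambda c: rfm_score(CUSTOMERS[c], weights), reverse=True)


for source, w in WEIGHTS.items():
    print(source, ranking(w))
```

With these illustrative scores the two weight vectors rank the customers in opposite order, which is the kind of instability that motivates working with a weight order rather than exact weights.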
The weights reflect the difference in importance between the indicators. There is another way to express importance, called non-normalized weights, for which the corresponding RFM formula is as follows [29]:
(2)
This method of analysis describes that recency is much more important than frequency, and frequency is much more important than amount. There is another expression that is similar, but the difference in the weights of the indicators is not so pronounced, namely [30]:
(3)
When the importance of the indicators is expressed as a ranking, Eq. (3) is closer to ordinary linguistic habits, such as "recency is more important than purchase frequency, and purchase frequency is more important than amount".
The literature reviewed above also shows clearly that index weights vary greatly. Stone [31] suggested that RFM variables should be assigned different weights according to industry characteristics. This study agrees with Stone's view, but also proposes that weighting should take into account the personal preferences of decision-makers. Introducing personal preference information explains the conflict in indicator weighting in the current literature: different authors obtain different weights for the same scenario and the same method because of the complexity of the real problem. For example, two e-commerce Hanfu stores may sell the same clothing subcategory, yet because they are at different stages of development and adopt different business strategies, their index preferences may differ. Likewise, the same store may change its preferences as it develops; for instance, purchase frequency becomes more important when the goal is to improve customer stickiness. Therefore, it is necessary to add personal preference information, which is precisely the weight order of the indicators, to the RFM model.
3 RFM model based on partially ordered sets
3.1 Basics of partially ordered sets
To understand partial order, one can start with "comparison". For example, in the qualifiers for the soccer World Cup, the points of each team are tallied and the qualifying teams are determined by their ranking. Because of cognitive and resource limitations, however, some objects cannot be compared during evaluation. Who has the higher historical status, for instance, the basketball player LeBron James or Michael Jordan? There is no single or fixed answer to such questions. When some comparative relationships are clear and others cannot be confirmed, they need to be described with the help of partial order relations.
Definition 1 [32] Let $\le$ be a binary relation on a non-empty set $A$. If it satisfies reflexivity, antisymmetry, and transitivity, it is called a partial order relation on $A$.
(1) Reflexivity: for any $x \in A$, $x \le x$;
(2) Antisymmetry: for any $x, y \in A$, if $x \le y$ and $y \le x$, then $x = y$;
(3) Transitivity: for any $x, y, z \in A$, if $x \le y$ and $y \le z$, then $x \le z$.
The set $A$ together with the partial order relation $\le$ on it is called a partially ordered set (poset) and is denoted $(A, \le)$. The symbol $\le$ is read "less than or equal to". If $x \le y$ or $y \le x$, then $x$ is said to be comparable with $y$; otherwise $x$ and $y$ are not comparable.
If any two objects are comparable, the relation is a total order, which is a special case of a partial order. In a partial order, two objects are not necessarily comparable. According to the concept of a poset, for an evaluation problem with multiple indicators, the evaluation objects constitute a set, and the remaining work is to construct the partial order relation. The following example describes the partial order analysis of a weighted RFM model.
Example 1 Suppose there are two customers, a and b, whose scores on the recency, frequency, and monetary indicators are shown in Table 2. Each indicator is scored on a 5-point scale: the highest score is 5, and the others are 4, 3, 2 and 1.
Table 2 Metric data of two customers
 | R | F | M |
a | 3 | 4 | 2 |
b | 5 | 3 | 1 |
According to Eq. (1), with $w_R$, $w_F$ and $w_M$ denoting the weights of R, F and M, the value of the two customers in Example 1 can be calculated, i.e.
$$RFM_a = 3w_R + 4w_F + 2w_M \tag{4}$$
$$RFM_b = 5w_R + 3w_F + w_M \tag{5}$$
If $RFM_a > RFM_b$, the customer value of a is higher than that of b. Because the weights are unknown, Eq. (4) and Eq. (5) cannot be compared directly. If only the order of the weights, that is, the preference over the weights, is available, for example $w_R \ge w_F \ge w_M$, can the schemes still be compared? The following theorem answers this question.
Theorem 1 [33] Given the evaluation set $(A, C, X)$ and index weights $w_1 \ge w_2 \ge \cdots \ge w_n \ge 0$, for $a_i, a_j \in A$, if
$$\sum_{k=1}^{t} x_{ik} \le \sum_{k=1}^{t} x_{jk}, \quad t = 1, 2, \dots, n \tag{6}$$
then $a_i \le a_j$.
In this theorem, $A = \{a_1, a_2, \dots, a_m\}$ is the set of schemes, that is, the set of evaluation objects; $C = \{c_1, c_2, \dots, c_n\}$ is the set of indicators, ordered from most to least important; and $X = (x_{ik})_{m \times n}$ is the evaluation matrix. Eq. (6) is not intuitive enough, and expanding it gives the following formula
$$\begin{cases} x_{i1} \le x_{j1} \\ x_{i1} + x_{i2} \le x_{j1} + x_{j2} \\ \quad \vdots \\ x_{i1} + x_{i2} + \cdots + x_{in} \le x_{j1} + x_{j2} + \cdots + x_{jn} \end{cases} \tag{7}$$
Eq. (7) contains no weights, and if it holds, the scheme comparison can be completed. The formula has n inequalities, as many as there are indicators. The first row involves the most important indicator, the second row the two most important indicators, and so on, until the nth row involves all n indicators. If Eq. (7) is true, then $a_i \le a_j$ (also written $a_j \ge a_i$).
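Assuming the weight order $w_R \ge w_F \ge w_M$ in Example 1, substituting the data in Table 2 into Eq. (7) yields the following result
$$\begin{cases} 3 \le 5 \\ 3 + 4 \le 5 + 3 \\ 3 + 4 + 2 \le 5 + 3 + 1 \end{cases} \tag{8}$$
It can be seen that, for this decision-maker, the RFM value of b is higher than that of a. Under a different preference order the conclusion can reverse: for example, with $w_F \ge w_M \ge w_R$ the cumulative sums are 4, 6, 9 for a and 3, 4, 9 for b, so applying Eq. (7) shows that the RFM value of a is higher than that of b.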
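The comparison rule of Theorem 1 can be sketched in a few lines: take running sums of the scores in order of decreasing importance and compare them position by position. The function names `cumulative` and `compare` are assumptions made for this illustration; the scores are those of Table 2, with the row labels a and b taken from the text.

```python
from itertools import accumulate

# 5-point (R, F, M) scores from Table 2; the labels a and b follow the text.
SCORES = {"a": {"R": 3, "F": 4, "M": 2}, "b": {"R": 5, "F": 3, "M": 1}}


def cumulative(customer, order):
    """Running sums of the scores, taken from most to least important indicator."""
    return list(accumulate(SCORES[customer][k] for k in order))


def compare(x, y, order):
    """Return '<=', '>=', '=' or 'incomparable' for x versus y under the weight order."""
    cx, cy = cumulative(x, order), cumulative(y, order)
    le = all(p <= q for p, q in zip(cx, cy))
    ge = all(p >= q for p, q in zip(cx, cy))
    if le and ge:
        return "="
    if le:
        return "<="
    if ge:
        return ">="
    return "incomparable"


print(compare("a", "b", "RFM"))  # weight order w_R >= w_F >= w_M
print(compare("a", "b", "FMR"))  # weight order w_F >= w_M >= w_R
print(compare("a", "b", "FRM"))  # weight order w_F >= w_R >= w_M
```

The third call illustrates that some weight orders leave the two customers incomparable, which is exactly the situation a partial order is meant to represent.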
For Eq. (7), Ref. [33] gives a more concise matrix expression, i.e.
$$XE = \left( \sum_{k=1}^{t} x_{ik} \right)_{m \times n} \tag{9}$$
where $E$ is the upper triangular matrix, i.e.
$$E = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ 0 & 1 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}_{n \times n}$$
Eq. (9) is general: multiplying the decision matrix $X$ by $E$ gives the matrix $XE$, which can be called the additive transformation of $X$. According to the right side of Eq. (9), if the ith row vector of $XE$ is less than or equal to the jth row vector in every component, then $a_i \le a_j$. This comparison between rows clearly satisfies the definition of a partial order relation, so whenever the ith row vector is less than or equal to the jth row vector, this is written $a_i \le a_j$.
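For Example 1, the additive transformation matrix obtained by applying Eq. (9) is
$$XE = \begin{pmatrix} 3 & 4 & 2 \\ 5 & 3 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 3 & 7 & 9 \\ 5 & 8 & 9 \end{pmatrix} \tag{10}$$
It is not difficult to verify that the two rows on the right-hand side of Eq. (10) are exactly the cumulative sums compared in Eq. (8).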
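As a check on the matrix form, the following sketch multiplies the decision matrix of Example 1 by an upper triangular matrix of ones using plain Python lists. The helper names `upper_triangular_ones` and `matmul` are assumptions for this illustration, and the columns of `X` are assumed to be ordered by importance (R, F, M).

```python
def upper_triangular_ones(n):
    """n x n matrix with ones on and above the diagonal and zeros below it."""
    return [[1 if row <= col else 0 for col in range(n)] for row in range(n)]


def matmul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


# Decision matrix of Example 1, columns ordered by importance (R, F, M).
X = [[3, 4, 2],
     [5, 3, 1]]

XE = matmul(X, upper_triangular_ones(3))
print(XE)
```

Each row of the product holds the running sums of the corresponding customer, so the row-by-row comparison can be done directly on `XE`.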
Without losing the generality, set and, substitute it into formula (9), organize there
(11)
So it can be known, that is.
3.2 Drawing the Hasse diagram of a partial order
Hasse diagrams are named after the German mathematician Helmut Hasse (1898–1979), who used them effectively.
First, the partial order relation matrix is established from the partial order relation. Given the scheme set $A = \{a_1, a_2, \dots, a_m\}$, if $a_i \le a_j$, set $r_{ij} = 1$; otherwise, including when $a_i$ and $a_j$ are not comparable, set $r_{ij} = 0$. The matrix $R = (r_{ij})_{m \times m}$ is the partial order relation matrix.
The matrix $R$ is transformed to obtain the Hasse matrix $H_R$, which contains no redundant information. The conversion formula is
$$H_R = (R - I) - (R - I) * (R - I) \tag{12}$$
where $I$ is the identity matrix and the operator $*$ is Boolean multiplication. $H_R$ is in one-to-one correspondence with the Hasse diagram, and the application and interpretation of the Hasse diagram are detailed in Ref. [34].
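A minimal sketch of this conversion is given below. It assumes the commonly used form H = (R - I) - (R - I)*(R - I), where * is Boolean matrix multiplication, and the convention that entry (i, j) of R is 1 when element i is less than or equal to element j. The three-element chain and the function names are hypothetical.

```python
def bool_matmul(a, b):
    """Boolean matrix product: entry (i, j) is 1 if some k has a[i][k] = b[k][j] = 1."""
    n = len(a)
    return [[int(any(a[i][k] and b[k][j] for k in range(n))) for j in range(n)]
            for i in range(n)]


def hasse_matrix(r):
    """Drop the diagonal of r, then remove every relation implied by transitivity."""
    n = len(r)
    strict = [[r[i][j] - int(i == j) for j in range(n)] for i in range(n)]
    implied = bool_matmul(strict, strict)
    return [[strict[i][j] - implied[i][j] for j in range(n)] for i in range(n)]


# Hypothetical chain x <= y <= z, with R[i][j] = 1 meaning "element i <= element j".
R = [[1, 1, 1],
     [0, 1, 1],
     [0, 0, 1]]

H = hasse_matrix(R)
print(H)
```

In this example the relation between x and z is implied by the two shorter steps, so it disappears from `H` and only the two covering relations remain.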
There is also a more intuitive explanation. For example, for the data in Table 4 below, after the additive transformation (Table 6), the partial order relation matrix is established by comparing the rows pairwise; if $a_i \le a_j$, an arrow is drawn from $a_i$ to $a_j$, that is, $a_i \to a_j$. Traversing all pairs of nodes gives the following directed graph
Figure 1 Directed graph corresponding to the data in Table 4
However, the graph contains redundant information and needs to be "slimmed" down to its simplest version. For any two nodes connected by two or more paths, if one of these paths has length 1, the corresponding arrow (dotted line in Figure 1) is "cut off". When no more arrows can be deleted, the graph is a Hasse diagram. Figure 2 is obtained according to this cutting criterion.
Figure 2 Hasse diagram corresponding to the data in Table 4
3.3 Customer classification based on Hasse diagrams
Hasse diagrams have obvious hierarchical characteristics: the elements can be divided into layers, and there is a hierarchical relationship between the layers. For example, the customers in Figure 2 can be divided into four clear layers: the first layer is {A589, A53, A56}, the second layer is {A527, A575, A58, A580}, the third layer is {A534}, and the fourth layer is {A537}. In this example, the better an element is, the closer it lies to the top of the diagram; the average of an upper layer must be higher than that of a lower layer, and the relationship is transitive. In Figure 2, the mean RFM of the first layer must be higher than that of the second layer, that of the second layer higher than that of the third layer, and that of the third layer higher than that of the fourth layer; by transitivity, the first layer is also higher than the third and fourth layers.
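The layering described above can be reproduced with a short sketch that repeatedly peels off the maximal elements. Peeling maximal elements is one common layering convention and is an assumption here, as are the function names; the input values are the cumulative columns listed in Table 6.

```python
# Cumulative (R, R+F, R+F+M) values of the nine customers in Table 6.
TABLE6 = {
    "A527": (0.0313, 0.1678, 0.9704),
    "A53": (0.3739, 0.3739, 0.5856),
    "A534": (0.0152, 0.1517, 0.5621),
    "A537": (0.0142, 0.1507, 0.3624),
    "A56": (0.3697, 0.3697, 0.7359),
    "A575": (0.0382, 0.2716, 0.5972),
    "A58": (0.3577, 0.3577, 0.5707),
    "A580": (0.0132, 0.2465, 0.6224),
    "A589": (0.2627, 0.5712, 1.2312),
}


def dominated_by(x, y):
    """True if the row of x is <= the row of y in every column and the rows differ."""
    return TABLE6[x] != TABLE6[y] and all(p <= q for p, q in zip(TABLE6[x], TABLE6[y]))


def hasse_layers():
    """Layer 1 holds the maximal customers; each later layer is maximal among the rest."""
    remaining = set(TABLE6)
    layers = []
    while remaining:
        top = {x for x in remaining if not any(dominated_by(x, y) for y in remaining)}
        layers.append(sorted(top))
        remaining -= top
    return layers


layers = hasse_layers()
for number, layer in enumerate(layers, start=1):
    print(number, layer)
```

For these nine customers the sketch yields four layers, and their membership can be compared with the layers read from Figure 2.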
A clear hierarchy that is strictly enforced can motivate people to work towards higher and better levels, as the well-known theory of social comparison processes demonstrates. Hierarchy plays an important role in achieving accurate self-knowledge and self-improvement, especially in upward social comparison. Hierarchies also have an advantage in constructing knowledge representations, which contain both tacit and explicit knowledge. Hierarchical clustering not only groups customers, but also shows the relationships between groups.
How can the "layers" be clustered again? Traditional distance-based clustering is often criticized for lacking statistical testing, and distances on the Hasse diagram are topological distances, for which traditional distance calculations are not suitable. Therefore, a distance-independent approach, combined with a non-parametric statistical test, is adopted.
First, the given layers are "folded in half", dividing them into two groups. For example, Figure 2 has four layers: the first and second layers are grouped together (denoted category A for ease of description), and the third and fourth layers are grouped together (denoted category B). A non-parametric test is then used to check whether the two categories differ statistically, i.e., whether the three indicators R, F and M differ significantly between categories A and B. If the difference is statistically significant, the layers are divided into two groups; if not, they are not divided further. If A is to be grouped again, the layers in group A are also folded in half and the statistical test is repeated: if it is significant, A is divided into two groups, and if not, the division stops.
Since the "layers" of the Hasse diagram are used as statistical units, there is generally no need to group elements within the same layer. This grouping method is not only statistically supported, but also partitions without a priori parameters. Steps:
(1) Number the layers from top to bottom, giving n layers;
(2) If n is an even number, the upper n/2 layers are class A and the lower n/2 layers are class B;
(3) Check whether the model indicators differ statistically between classes A and B;
(4) If the difference is significant, split the layers into the two classes; if the difference is not significant, stop grouping;
(5) Repeat (1)~(4) on each resulting class until no class can be divided further, then stop.
There are two points to note here. First, when n is an odd number, it can be treated approximately as n-1, and n must be much greater than 1. Second, the stop condition of grouping can be set according to practical needs. Both the traditional RFM model and its extensions involve multiple indicators, so some indicators may be significant while others are not; how many non-significant indicators should stop the division can be decided by considering the number of customers, and there is no specific theoretical standard.
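The steps above can be sketched as a recursive procedure. The text does not name a specific non-parametric test, so this sketch substitutes a simple two-sided permutation test on the difference of means; the synthetic layers, the 0.05 level, the rule "split when at least two indicators are significant", and all function names are assumptions made for illustration.

```python
import random


def permutation_p_value(xs, ys, rng, n_perm=2000):
    """Two-sided permutation test on the difference of group means."""
    observed = abs(sum(xs) / len(xs) - sum(ys) / len(ys))
    pooled = list(xs) + list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        left, right = pooled[:len(xs)], pooled[len(xs):]
        if abs(sum(left) / len(left) - sum(right) / len(right)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def upper_half_end(lo, hi):
    """Last layer of the upper half; an odd number of layers n is treated as n - 1."""
    return lo + (hi - lo + 1) // 2 - 1


def split_layers(layers, lo, hi, rng, alpha=0.05, min_significant=2):
    """Recursively halve layers lo..hi (1-based, inclusive) and return the final classes."""
    if hi - lo + 1 < 2:
        return [(lo, hi)]
    mid = upper_half_end(lo, hi)
    upper = [c for layer in layers[lo - 1:mid] for c in layer]
    lower = [c for layer in layers[mid:hi] for c in layer]
    significant = sum(
        permutation_p_value([c[k] for c in upper], [c[k] for c in lower], rng) < alpha
        for k in range(3)  # k = 0, 1, 2 stands for the R, F and M indicators
    )
    if significant < min_significant:
        return [(lo, hi)]
    return (split_layers(layers, lo, mid, rng, alpha, min_significant)
            + split_layers(layers, mid + 1, hi, rng, alpha, min_significant))


# Synthetic (R, F, M) scores: two identical high layers above two identical low layers.
HIGH = [(5, 4, 5), (4, 5, 4), (5, 5, 4), (4, 4, 5)]
LOW = [(2, 1, 1), (1, 2, 2), (1, 1, 2), (2, 2, 1)]
LAYERS = [HIGH, HIGH, LOW, LOW]

classes = split_layers(LAYERS, 1, len(LAYERS), random.Random(0))
print(classes)
```

In this synthetic case the top-level split separates the high layers from the low layers, and neither half is divided further because its two layers are identical. In practice the test, the significance level and the stop rule would be chosen to suit the data.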
4 Application examples
The dataset used in the study comes from a Chinese Hanfu e-commerce company. It contains the data of 3,447 customers who purchased the company's Hanfu between April 2019 and August 2022. There are 8 columns: buyer nickname, buyer level, province, first purchase date, latest purchase date, number of purchases (F), cumulative purchase amount (M, in yuan), and whether the first order was guided by advertising (advertising).
Table 3 Raw metric data (fragment)
Buyer nickname | Buyer level | Province | First purchase date | Latest purchase date | F | M | Advertising |
nanxinli | L3 | | 2019-04-19 | 2022-05-07 | 2 | 1828 | No |
 | L3 | | 2019-05-02 | 2021-08-29 | 1 | 90 | No |
 | L1 | | 2019-05-03 | 2021-09-22 | 1 | 192 | No |
longdanever | L1 | | 2019-05-09 | 2022-05-08 | 1 | 120 | No |
 | L2 | | 2019-05-10 | 2022-02-27 | 1 | 609 | No |
 | L1 | | 2019-05-12 | 2022-03-19 | 1 | 158 | No |
 | L1 | | 2019-05-15 | 2021-09-09 | 1 | 2080 | No |
 | L2 | | 2019-05-15 | 2021-12-23 | 2 | 2960 | No |
4.1 Data Preprocessing
Because the dataset spans only about three years, all customers in that period are classified with the RFM model. The observation date is August 12, 2022, and recency is calculated from the date of the most recent purchase: the indicator R is the observation date minus the latest purchase date. Because repurchase behavior exists only for customers with two or more purchases, customers with exactly one purchase are deleted, leaving 690 customer records.
First, the distribution of the indicator data tends to be nonlinear and non-normal. Second, taking the monetary indicator as an example, in many cases the top 5% of customers already account for more than half of revenue, so taking the top 20% as one category would overestimate the proportion of gold customers. Because of the large difference between the maximum and minimum values of the indicators, the three indicators R, F and M are log-transformed with base e, and the specific formulas are as follows:
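Let $R_{\min}$ and $R_{\max}$ be the minimum and maximum values of the recency indicator. Since a smaller recency is better, the normalized indicator value is
$$R' = \frac{\ln R_{\max} - \ln R}{\ln R_{\max} - \ln R_{\min}} \tag{13}$$
The maximum and minimum of the other indicators are defined in a similar way, and the normalization formulas for the frequency and amount indicators are:
$$F' = \frac{\ln F - \ln F_{\min}}{\ln F_{\max} - \ln F_{\min}} \tag{14}$$
$$M' = \frac{\ln M - \ln M_{\min}}{\ln M_{\max} - \ln M_{\min}} \tag{15}$$
Normalized data are obtained from Eq. (13) ~ Eq. (15) (Table 4)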
Table 4 Normalized RFM metrics after logarithmic processing (fragment)
| R | F | M |
A527 | 0.0313 | 0.1365 | 0.8025 |
A53 | 0.3739 | 0.0000 | 0.2118 |
A534 | 0.0152 | 0.1365 | 0.4104 |
A537 | 0.0142 | 0.1365 | 0.2118 |
A56 | 0.3697 | 0.0000 | 0.3662 |
A575 | 0.0382 | 0.2334 | 0.3256 |
A58 | 0.3577 | 0.0000 | 0.2130 |
A580 | 0.0132 | 0.2334 | 0.3759 |
A589 | 0.2627 | 0.3085 | 0.6600 |
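The logarithmic min-max normalization can be sketched as follows. The purchase-count bounds 2 and 39 are inferred from the logarithms 0.6931 and 3.6636 reported for F in Table 5, and the option that reverses the scale for recency (so that a smaller number of days scores higher) is an assumption; the function name is likewise hypothetical.

```python
import math


def log_minmax(value, vmin, vmax, smaller_is_better=False):
    """Min-max scale ln(value) to [0, 1]; optionally reverse so small raw values score high."""
    lo, hi, x = math.log(vmin), math.log(vmax), math.log(value)
    scaled = (x - lo) / (hi - lo)
    return 1.0 - scaled if smaller_is_better else scaled


# Purchase counts: ln(2) = 0.6931 and ln(39) = 3.6636 match the F column of Table 5.
F_MIN, F_MAX = 2, 39
for purchases in (2, 3, 4, 5):
    print(purchases, round(log_minmax(purchases, F_MIN, F_MAX), 4))
```

The printed values for 2 to 5 purchases can be compared with the F column of Table 4, which contains only a handful of distinct values because purchase counts are small integers.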
The RFM indicators obtained show that 690 customers made two or more purchases, accounting for 20.017% of all customers. After the logarithmic transformation, the maximum values of the R, F and M indicators are 5.8916, 3.6636 and 8.3544, respectively. The standard deviations of the three indicators are all close to or below 1, so the log-transformed indicators are of the same order of magnitude, which lays the foundation for introducing weights. The statistical distribution of the three indicators is as follows:
Table 5 Statistics of RFM metrics
 | R | F | M |
N | 690 | 690 | 690 |
Mean | 4.374840 | 1.036027 | 6.067525 |
Std. deviation | 1.0979443 | .5739282 | .9098434 |
Minimum | .0000 | .6931 | 4.4773 |
Maximum | 5.8916 | 3.6636 | 8.3544 |
4.2 Hasse diagram of the customer group RFM model
Hanfu is a kind of clothing whose customers take a long time to choose and buy across many categories. In this case, the company experienced increased customer churn while growing rapidly, and is currently adopting an operational strategy to reduce churn. In this context, recency becomes the most important indicator, purchase frequency is second, and purchase amount is the least important, so $w_R \ge w_F \ge w_M$. According to Eq. (11), the index data $X$ is transformed additively, i.e. $XE$. For example, the result of the additive transformation of the data in Table 4 is Table 6.
Table 6 Cumulative transformation of some data
| R | R+F | R+F+M |
A527 | 0.0313 | 0.1678 | 0.9704 |
A53 | 0.3739 | 0.3739 | 0.5856 |
A534 | 0.0152 | 0.1517 | 0.5621 |
A537 | 0.0142 | 0.1507 | 0.3624 |
A56 | 0.3697 | 0.3697 | 0.7359 |
A575 | 0.0382 | 0.2716 | 0.5972 |
A58 | 0.3577 | 0.3577 | 0.5707 |
A580 | 0.0132 | 0.2465 | 0.6224 |
A589 | 0.2627 | 0.5712 | 1.2312 |
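The cumulative columns of Table 6 can be reproduced from the normalized values of Table 4 with a running sum taken in the chosen order of importance. In the sketch below the three rows are copied from Table 4, while the `order` parameter and the alternative order "FRM" are illustrative assumptions showing how a different strategy would change the transformed data.

```python
from itertools import accumulate

# Normalized (R, F, M) values of three customers from Table 4.
TABLE4 = {
    "A534": (0.0152, 0.1365, 0.4104),
    "A56": (0.3697, 0.0000, 0.3662),
    "A589": (0.2627, 0.3085, 0.6600),
}
COLUMN = {"R": 0, "F": 1, "M": 2}


def additive_transform(row, order="RFM"):
    """Running sums of a row, taken from the most to the least important indicator."""
    return tuple(round(v, 4) for v in accumulate(row[COLUMN[k]] for k in order))


for name, row in TABLE4.items():
    print(name, additive_transform(row), additive_transform(row, order="FRM"))
```

Only the order string has to change when the decision-maker's preference changes; the raw indicator data stay the same.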
The row vectors are compared to establish the comparison relation matrix, and the Hasse matrix is obtained according to Eq. (12). A Hasse diagram is then drawn from the Hasse matrix. Because the group of repeat customers contains 690 people, there are too many elements for the relationships between them to be read intuitively from the full diagram. Therefore, 100 customers are randomly selected to generate a Hasse diagram.
Figure 3 Hasse diagram of 100 randomly selected customers
Although Figure 3 shows only 100 random customers, the shape of the diagram reflects the overall characteristics of the customers to a certain extent: it is wider at the top and narrower at the bottom, where a wider part indicates more elements in the same layer and a narrower part fewer. Customers closer to the top of the diagram are of higher quality, so the shape shows that the company has accumulated a number of high-quality customers over the past three years of operation, and these customers have become core assets for the company's sustainable development.
To observe the distribution of customers over the layers in more detail, the top of the Hasse diagram is taken as the first layer, the next as the second layer, and so on; with the number of elements in each layer on the vertical axis, the line chart is drawn as follows
Figure 4 Distribution of the number of customers per layer
Overall, the number of layers reaches 89. The distribution of customers across layers is uneven: the largest layer has 23 people and the smallest has 1, and the average is about 8 customers per layer (690/89 ≈ 7.75). The layers with more elements are concentrated in the first 20 layers; from the 32nd to the 89th layer, no layer has more than 10 people.
4.3 Customer clustering based on the Hasse layers
According to the Hasse diagram, there are 89 layers from top to bottom, and customers are classified by grouping these layers. Stop condition: a group is divided only when two or more indicators are significant; otherwise it is treated as one category by default and is not divided further.
First, all layers are considered as a whole and the middle layer is used as the dividing point, so that layers 1~44 form one category and layers 45~89 another. The non-parametric test found that the R, F and M indicators were all significant, so the layers were divided into two categories. Layers 1~44 were then divided into two categories, and all three indicators were again found to be significant. Layers 45~89 were divided into two categories, and the R and M indicators were found to be significant. When layers 67~89 were tested, only indicator R was significant; classification stops when only one indicator or no indicator is significant. For example, when layers 1~16 are redivided, only indicator R is significant, so they are treated as one category by default. By analogy, the classification results are as follows.
Figure 5 Customer-level classification chart
According to the results at the right end of the classification tree in Figure 5, the 690 customers in this example are finally divided into 9 categories (Table 7). The categories are ordered by priority: in Table 7, class A is better than class B, class B is better than class C, and so on, until class H is better than class I.
Table 7 Customer clusters, number of layers, and elements
Class | Layers | Number of customers | Proportion |
A | 1~5 | 35 | 5.07% |
B | 6~11 | 87 | 12.61% |
C | 12~16 | 89 | 12.90% |
D | 17~22 | 102 | 14.78% |
E | 23~33 | 118 | 17.10% |
F | 34~44 | 55 | 7.97% |
G | 45~66 | 132 | 19.13% |
H | 67~77 | 44 | 6.38% |
I | 78~89 | 28 | 4.06% |
At a coarse granularity, the customers in layers 1~5 are the core assets of the company; they account for 5.07% of the repurchase group. Note that what counts as a core asset here is closely tied to the company's strategic positioning: under the current strategy of reducing customer churn, Class A customers are extremely important. In addition, Class B is a key customer resource of the enterprise, and Class C is an important customer resource. Classes H and I are customer resources that are unimportant or can be given up. Different marketing strategies can be adopted for different resource levels, such as activating class D or E customers with new products and reawakening class G or F customers with promotions.
5 Conclusion
The partial order RFM model addresses the weighting problem: the user only needs to determine the order of the weights of the recency, frequency and monetary indicators according to the practical situation, and the hierarchical classification of customers can then be completed. Since identifying the weight order is much easier than specifying exact weights, the running cost of the model is significantly reduced. The weight order also represents the index preference of the decision-maker; in this sense, the partial order RFM model combines personal preference information with data, reflecting the integration of experience and data. Incorporating personal preference information explains why different weights arise in the same scenario, and even the same enterprise has different weight preferences at different stages of development, so the partial order RFM model and its extensions can effectively reflect the development strategy of the organization.
Customer classification based on Hasse diagrams differs from traditional k-means classification. K-means divides objects into several categories that generally have no hierarchical relationship, whereas the categories obtained from the Hasse diagram have an obvious hierarchical relationship, which makes it convenient to allocate resources by category level and to set priorities for customers at different levels. In addition, in form the classification method in this paper resembles hierarchical (systematic) clustering, which can produce both coarse-grained and fine-grained classes, but the method in this paper can also reflect the hierarchical relationship between classes.
References