Week 9: Classification and Uplift

Summary

Logistic regression

$$y = \frac{1}{1+e^{-x}}$$

  • Equation for logistic regression

$$y = \frac{1}{1+e^{-\left(\beta_0+\beta_1x_1+\beta_2x_2+\dots+\beta_px_p\right)}}$$

$$\log\left(\frac{y}{1-y}\right)=\beta_0+\beta_1x$$
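As a quick numerical check of the formulas above, the following minimal sketch evaluates the logistic function and the log-odds using only Python's standard library. The function names `sigmoid` and `log_odds` and the coefficient values are illustrative assumptions, not part of the course material.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(y):
    """Inverse of the logistic function: log(y / (1 - y)) for 0 < y < 1."""
    return math.log(y / (1.0 - y))

# Hypothetical coefficients for a single-feature model (illustrative values only).
beta_0, beta_1 = -1.0, 2.0
x = 0.75

z = beta_0 + beta_1 * x      # linear predictor beta_0 + beta_1 * x
y = sigmoid(z)               # predicted probability
recovered = log_odds(y)      # matches z up to floating-point rounding
```

Applying the log-odds to the predicted probability recovers the linear predictor, which is why the coefficients are read on the log-odds scale.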

Logistic regression with sklearn

from sklearn.linear_model import LogisticRegression
  • Build the model

log_reg = LogisticRegression()
log_reg.fit(features, target_class)
    • [$\beta_0$] → .intercept_

    • [[$\beta_1$, ..., $\beta_p$]] → .coef_

log_reg.predict(features)
log_reg.predict_proba(features)
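The calls above can be tied together in one runnable sketch. The data here is synthetic and made up purely for illustration (a single feature and a noisy threshold rule), and `manual_p1` is a hypothetical name. The sketch assumes the default `LogisticRegression` settings on a binary target.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic, made-up data: one feature; class 1 becomes more likely as x grows.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 1))
target_class = (features[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)

log_reg = LogisticRegression()
log_reg.fit(features, target_class)

beta_0 = log_reg.intercept_   # array of shape (1,)
betas = log_reg.coef_         # array of shape (1, n_features)

predicted_classes = log_reg.predict(features)     # hard 0/1 labels
probabilities = log_reg.predict_proba(features)   # columns: P(class 0), P(class 1)

# Recompute P(class 1) by hand from the fitted coefficients.
z = beta_0 + features @ betas.T                   # linear predictor, shape (200, 1)
manual_p1 = 1.0 / (1.0 + np.exp(-z[:, 0]))
```

For this binary case, `manual_p1` should agree with the second column of `predict_proba`, which shows that `.intercept_` and `.coef_` are the coefficients in the logistic equation.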

Classification metrics

from sklearn.metrics import confusion_matrix
confusion_matrix(true_classes, predicted_classes)
tn, fp, fn, tp = confusion_matrix(true_classes, predicted_classes).ravel()

    $$\text{Precision}=\frac{TP}{TP+FP} \hspace{1.5cm} \text{Recall}=\frac{TP}{TP+FN}$$

      $$\text{F}_1=\frac{2\times\text{precision}\times\text{recall}}{\text{precision}+\text{recall}}$$
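To make the three formulas concrete, here is a minimal sketch that counts the confusion-matrix cells by hand and then computes precision, recall and F1. It deliberately avoids sklearn so that each count is visible. The helper `confusion_counts` and the label lists are hypothetical, and labels are assumed to be coded 0 (negative) and 1 (positive).

```python
def confusion_counts(true_classes, predicted_classes):
    """Count tn, fp, fn, tp for binary labels coded as 0 (negative) and 1 (positive)."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(true_classes, predicted_classes):
        if truth == 1 and pred == 1:
            tp += 1          # positive correctly predicted
        elif truth == 0 and pred == 1:
            fp += 1          # negative wrongly predicted as positive
        elif truth == 1 and pred == 0:
            fn += 1          # positive missed
        else:
            tn += 1          # negative correctly predicted
    return tn, fp, fn, tp

# Made-up labels for illustration only.
true_classes      = [1, 0, 1, 1, 0, 0, 1, 0]
predicted_classes = [1, 0, 0, 1, 1, 0, 1, 0]

tn, fp, fn, tp = confusion_counts(true_classes, predicted_classes)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
```

The helper returns the counts in the same order as `confusion_matrix(true_classes, predicted_classes).ravel()`. With these labels there are 3 true positives, 1 false positive and 1 false negative, so precision, recall and F1 all equal 0.75.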

      Customer types

      1. Sure things: customers who will achieve the campaign goal whether or not they are treated

      2. Persuadables: customers who will respond to the campaign only if they are treated, e.g. sent a message about a promotion

      3. Lost causes: the opposite of "Sure things" in that they will not respond to the campaign regardless of whether they are treated

      4. Sleeping dogs: customers who would achieve the campaign goal if left alone but will not if treated, i.e. they respond negatively to treatment

      Note: treatment here refers to a customer being targeted with the campaign
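The four customer types can be read as the four combinations of two yes/no outcomes: does the customer respond if treated, and do they respond if not treated? The sketch below encodes that mapping. The function name `customer_type` and its boolean arguments are hypothetical. In practice only one of the two outcomes is ever observed for a given customer, so this is a conceptual illustration rather than something that can be run on real campaign data.

```python
def customer_type(responds_if_treated, responds_if_untreated):
    """Map a customer's two potential outcomes to one of the four types."""
    if responds_if_treated and responds_if_untreated:
        return "Sure thing"      # responds either way
    if responds_if_treated and not responds_if_untreated:
        return "Persuadable"     # responds only because of the treatment
    if not responds_if_treated and responds_if_untreated:
        return "Sleeping dog"    # treatment turns a response into a non-response
    return "Lost cause"          # responds in neither case
```

For example, `customer_type(True, False)` returns `"Persuadable"`, the group a campaign gains most from targeting.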

      Uplift modelling

      Control models calculate the probability that a customer naturally achieves the goal of the marketing campaign.

      Treatment models calculate the probability that a customer achieves the goal of the marketing campaign after being shown the marketing material, i.e. treated.

      Uplift is defined as follows:

      $$\text{Uplift} = P(\text{action based on treatment}) - P(\text{action based on no treatment})$$
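As a small worked example of the definition, the sketch below subtracts a control-model probability from a treatment-model probability for each customer and ranks customers by the result. The probabilities are invented for illustration; in practice they would come from fitted treatment and control models, for instance two logistic regressions, which is an assumption here and not something the slide prescribes. The names `uplift`, `customers`, `scores` and `ranked` are hypothetical.

```python
def uplift(p_treated, p_control):
    """Uplift = P(action | treatment) - P(action | no treatment)."""
    return p_treated - p_control

# Made-up (P(action | treated), P(action | not treated)) pairs per customer.
customers = {
    "A": (0.90, 0.88),   # high either way: behaves like a sure thing
    "B": (0.60, 0.20),   # large gain from treatment: behaves like a persuadable
    "C": (0.05, 0.04),   # low either way: behaves like a lost cause
    "D": (0.10, 0.45),   # treatment hurts: behaves like a sleeping dog
}

scores = {name: uplift(p_t, p_c) for name, (p_t, p_c) in customers.items()}

# Customers with the highest uplift are the most worthwhile to target.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Here customer B has the largest uplift (about 0.40) and customer D has a negative uplift, so a campaign guided by these scores would target B first and avoid D.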