ISBN 026218253X. (C) 2006 Massachusetts Institute of Technology. www.GaussianProcess.org/gpml

Symbols and Notation


Matrices are capitalized and vectors are in bold type. We do not generally distinguish between probabilities and probability densities. A subscript asterisk, such as in $X_*$, indicates reference to a test set quantity. A superscript asterisk denotes complex conjugate.

Symbol Meaning

$\backslash$ left matrix divide: $A\backslash\mathbf{b}$ is the vector $\mathbf{x}$ which solves $A\mathbf{x}=\mathbf{b}$

$\triangleq$ an equality which acts as a definition

$\overset{c}{=}$ equality up to an additive constant

$|K|$ determinant of $K$ matrix

$|\mathbf{y}|$ Euclidean length of vector $\mathbf{y}$, i.e. $\big(\sum_i y_i^2\big)^{1/2}$

$\langle f,g\rangle_{\mathcal{H}}$ RKHS inner product

$\|f\|_{\mathcal{H}}$ RKHS norm

$\mathbf{y}^\top$ the transpose of vector $\mathbf{y}$

$\propto$ proportional to; e.g. $p(x|y)\propto f(x,y)$ means that $p(x|y)$ is equal to $f(x,y)$ times a factor which is independent of $x$

$\sim$ distributed according to; example: $x\sim\mathcal{N}(\mu,\sigma^2)$

$\nabla$ or $\nabla_{\mathbf{f}}$ partial derivatives (w.r.t. $\mathbf{f}$)

$\nabla\nabla$ the (Hessian) matrix of second derivatives

$\mathbf{0}$ or $\mathbf{0}_n$ vector of all 0's (of length $n$)

$\mathbf{1}$ or $\mathbf{1}_n$ vector of all 1's (of length $n$)

$C$ number of classes in a classification problem

cholesky$(A)$ Cholesky decomposition: $L$ is a lower triangular matrix such that $LL^\top=A$

cov$(\mathbf{f}_*)$ Gaussian process posterior covariance

$D$ dimension of input space $\mathcal{X}$

$\mathcal{D}$ data set: $\mathcal{D}=\{(\mathbf{x}_i,y_i)\mid i=1,\ldots,n\}$

diag$(\mathbf{w})$ (vector argument) a diagonal matrix containing the elements of vector $\mathbf{w}$

diag$(W)$ (matrix argument) a vector containing the diagonal elements of matrix $W$

$\delta_{pq}$ Kronecker delta, $\delta_{pq}=1$ iff $p=q$ and 0 otherwise

$\mathbb{E}$ or $\mathbb{E}_{q(x)}[z(x)]$ expectation; expectation of $z(x)$ when $x\sim q(x)$

$f(\mathbf{x})$ or $\mathbf{f}$ Gaussian process (or vector of) latent function values, $\mathbf{f}=(f(\mathbf{x}_1),\ldots,f(\mathbf{x}_n))^\top$

$\mathbf{f}_*$ Gaussian process (posterior) prediction (random variable)

$\bar{\mathbf{f}}_*$ Gaussian process posterior mean

$\mathcal{GP}$ Gaussian process: $f\sim\mathcal{GP}(m(\mathbf{x}),k(\mathbf{x},\mathbf{x}'))$, the function $f$ is distributed as a Gaussian process with mean function $m(\mathbf{x})$ and covariance function $k(\mathbf{x},\mathbf{x}')$

$h(\mathbf{x})$ or $\mathbf{h}(\mathbf{x})$ either fixed basis function (or set of basis functions) or weight function

$H$ or $H(X)$ set of basis functions evaluated at all training points

$I$ or $I_n$ the identity matrix (of size $n$)

$J_\nu(z)$ Bessel function of the first kind

$k(\mathbf{x},\mathbf{x}')$ covariance (or kernel) function evaluated at $\mathbf{x}$ and $\mathbf{x}'$

$K$ or $K(X,X)$ $n\times n$ covariance (or Gram) matrix

$K_*$ $n\times n_*$ matrix $K(X,X_*)$, the covariance between training and test cases

$\mathbf{k}(\mathbf{x}_*)$ or $\mathbf{k}_*$ vector, short for $K(X,\mathbf{x}_*)$, when there is only a single test case

$K_f$ or $K$ covariance matrix for the (noise free) $\mathbf{f}$ values



C. E. Rasmussen & C. K. I. Williams, Gaussian Processes for Machine Learning, the MIT Press, 2006, ISBN 026218253X. (c) 2006 Massachusetts Institute of Technology. www.GaussianProcess.org/gpml


$K_y$ covariance matrix for the (noisy) $\mathbf{y}$ values; for independent homoscedastic noise, $K_y=K_f+\sigma_n^2 I$

$K_\nu(z)$ modified Bessel function

$\mathcal{L}(a,b)$ loss function, the loss of predicting $b$, when $a$ is true; note argument order

$\log(z)$ natural logarithm (base $e$)

$\log_2(z)$ logarithm to the base 2

$\ell$ or $\ell_d$ characteristic length-scale (for input dimension $d$)

$\lambda(z)$ logistic function, $\lambda(z)=1/(1+\exp(-z))$

$m(\mathbf{x})$ the mean function of a Gaussian process

$\mu$ a measure (see section A.7)

$\mathcal{N}(\boldsymbol{\mu},\Sigma)$ or $\mathcal{N}(\mathbf{x}|\boldsymbol{\mu},\Sigma)$ (the variable $\mathbf{x}$ has a) Gaussian (Normal) distribution with mean vector $\boldsymbol{\mu}$ and covariance matrix $\Sigma$

$\mathcal{N}(\mathbf{x})$ short for unit Gaussian $\mathbf{x}\sim\mathcal{N}(\mathbf{0},I)$

$n$ and $n_*$ number of training (and test) cases

$N$ dimension of feature space

$N_H$ number of hidden units in a neural network

$\mathbb{N}$ the natural numbers, the positive integers

$\mathcal{O}(\cdot)$ big Oh; for functions $f$ and $g$ on $\mathbb{N}$, we write $f(n)=\mathcal{O}(g(n))$ if the ratio $f(n)/g(n)$ remains bounded as $n\to\infty$

$O$ either matrix of all zeros or differential operator

$y|\mathbf{x}$ and $p(y|\mathbf{x})$ conditional random variable $y$ given $\mathbf{x}$ and its probability (density)

$P_N$ the regular $N$-polygon

$\boldsymbol{\phi}(\mathbf{x}_i)$ or $\Phi(X)$ feature map of input $\mathbf{x}_i$ (or input set $X$)

$\Phi(z)$ cumulative unit Gaussian: $\Phi(z)=(2\pi)^{-1/2}\int_{-\infty}^{z}\exp(-t^2/2)\,dt$

$\pi(\mathbf{x})$ the sigmoid of the latent value: $\pi(\mathbf{x})=\sigma(f(\mathbf{x}))$ (stochastic if $f(\mathbf{x})$ is stochastic)

$\hat{\pi}(\mathbf{x}_*)$ MAP prediction: $\pi$ evaluated at $\bar{f}(\mathbf{x}_*)$.

$\bar{\pi}(\mathbf{x}_*)$ mean prediction: expected value of $\pi(\mathbf{x}_*)$. Note that, in general, $\hat{\pi}(\mathbf{x}_*)\neq\bar{\pi}(\mathbf{x}_*)$

$\mathbb{R}$ the real numbers

$R_{\mathcal{L}}(f)$ or $R_{\mathcal{L}}(c)$ the risk or expected loss for $f$, or classifier $c$ (averaged w.r.t. inputs and outputs)

$\tilde{R}_{\mathcal{L}}(l|\mathbf{x}_*)$ expected loss for predicting $l$, averaged w.r.t. the model's predictive distribution at $\mathbf{x}_*$

$\mathcal{R}_c$ decision region for class $c$

$S(\mathbf{s})$ power spectrum

$\sigma(z)$ any sigmoid function, e.g. logistic $\lambda(z)$, cumulative Gaussian $\Phi(z)$, etc.

$\sigma_f^2$ variance of the (noise free) signal

$\sigma_n^2$ noise variance

$\boldsymbol{\theta}$ vector of hyperparameters (parameters of the covariance function)

tr$(A)$ trace of (square) matrix $A$

$\mathbb{T}_l$ the circle with circumference $l$

$\mathbb{V}$ or $\mathbb{V}_{q(x)}[z(x)]$ variance; variance of $z(x)$ when $x\sim q(x)$

$\mathcal{X}$ input space and also the index set for the stochastic process

$X$ $D\times n$ matrix of the training inputs $\{\mathbf{x}_i\}_{i=1}^n$: the design matrix

$X_*$ matrix of test inputs

$\mathbf{x}_i$ the $i$th training input

$x_{di}$ the $d$th coordinate of the $i$th training input $\mathbf{x}_i$

$\mathbb{Z}$ the integers $\ldots,-2,-1,0,1,2,\ldots$
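
Several entries in this table define operations rather than plain symbols: left matrix divide, the Cholesky decomposition, the two forms of diag, the noisy covariance for independent homoscedastic noise, the logistic function and the cumulative unit Gaussian. The following is a minimal sketch of those definitions in plain Python, not code from the book. All function names (`cholesky`, `left_divide`, `diag`, `noisy_cov`, `logistic`, `Phi`) are hypothetical choices for illustration, matrices are represented as lists of rows, and `left_divide` assumes a symmetric positive definite matrix (as covariance matrices with added noise typically are), which is narrower than the general definition of left matrix divide given in the table.

```python
import math


def cholesky(A):
    """Return lower-triangular L with L L^T = A.

    Assumes A is symmetric positive definite (list of rows).
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def left_divide(A, b):
    """Left matrix divide: the vector x which solves A x = b.

    Assumption: A is symmetric positive definite, so the system is
    solved through its Cholesky factor.
    """
    n = len(b)
    L = cholesky(A)
    # Forward substitution: solve L z = b.
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    # Backward substitution: solve L^T x = z.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def diag(arg):
    """Vector argument: diagonal matrix. Matrix argument: its diagonal."""
    if arg and isinstance(arg[0], (list, tuple)):
        return [arg[i][i] for i in range(len(arg))]
    n = len(arg)
    return [[arg[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def noisy_cov(K_f, sigma_n2):
    """K_y = K_f + sigma_n^2 I (independent homoscedastic noise)."""
    n = len(K_f)
    return [[K_f[i][j] + (sigma_n2 if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def logistic(z):
    """Logistic function: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def Phi(z):
    """Cumulative unit Gaussian, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    K_f = [[1.0, 0.5], [0.5, 1.0]]       # illustrative values only
    K_y = noisy_cov(K_f, 0.1)
    print(left_divide(K_y, [1.0, 2.0]))  # the vector K_y \ y
    print(logistic(0.0), Phi(0.0))
```

For matrices of realistic size, a numerical library's Cholesky and triangular solvers would normally replace the hand-written loops; the pure-Python version is used here only to keep the sketch self-contained.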