- Introduction
- Review: the strategy behind linear discriminant analysis
- Solving for the model parameters
 
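The previous section introduced the strategy behind linear discriminant analysis (LDA) and how that strategy is expressed in mathematical notation. Building on that strategy, this section derives the model parameters of LDA.

Review: the strategy behind linear discriminant analysis

At its core, linear discriminant analysis still partitions the sample space with a line (hyperplane). What sets it apart is that it gives the parameter $\mathcal W$ of the linear model a concrete meaning: $\mathcal W$ is the reference axis along which the $p$-dimensional sample space is projected onto a one-dimensional space.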
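- The decision boundary $\mathcal W^{T}x + b = 0$ is perpendicular to the reference axis $\mathcal W$. Once $\mathcal W$ is fixed, the orientation of the boundary, which is governed by the term $\mathcal W^{T}x^{(i)}$, is therefore fixed as well.
- With $\mathcal W$ fixed, all that remains is to find the optimal threshold that separates the two classes of projected samples. Requiring the model $\mathcal W^{T}x^{(i)} + b$ to pass through that threshold yields $b$ and completes the model.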
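The geometric claim in the two points above can be checked with a few lines of NumPy. This is only a sketch: the axis `W`, the offset `b`, and the sample `x` are hypothetical values chosen for illustration and do not come from the original article.

```python
import numpy as np

# Hypothetical reference axis W and offset b, chosen only for illustration.
W = np.array([2.0, 1.0])
b = -4.0

# Two points that lie on the decision boundary W^T x + b = 0.
p1 = np.array([0.0, 4.0])
p2 = np.array([2.0, 0.0])
on_boundary = [float(W @ p + b) for p in (p1, p2)]

# A direction lying inside the boundary is orthogonal to W.
boundary_direction = p2 - p1
dot_with_W = float(W @ boundary_direction)

# A sample is mapped to the scalar z = W^T x; the sign of z + b tells
# on which side of the threshold -b the projection falls.
x = np.array([3.0, 1.0])
z = float(W @ x)
side = int(np.sign(z + b))

print(on_boundary, dot_with_W, z, side)
```

Which side corresponds to which class label is a convention and is not fixed by the geometry.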
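Finding the optimal reference axis $\hat {\mathcal W}$ and building the model $\hat {\mathcal W}^{T}x^{(i)} + b$ are therefore essentially the same task. How is the optimal $\hat {\mathcal W}$ found? In other words, by what standard is one reference axis $\mathcal W$ judged better than another? Following LDA's principle of high cohesion and low coupling, the criterion is assessed from two angles, within classes and between classes: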
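- Within classes: the variance of the projected sample points of each class measures how tightly that class clusters;
- Between classes: the projected sample points of each class are averaged, and the gap between the class means measures how different the classes are.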
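To make the two criteria concrete, the following sketch projects two small classes onto two candidate axes and reports the between-class gap and the within-class spread for each. The data set and both candidate directions are made up for illustration; `projected_stats` and `summarize` are hypothetical helper names.

```python
import numpy as np

# Toy two-class data (illustrative values, not from the original article).
X1 = np.array([[4.0, 2.0], [2.0, 4.0], [2.0, 3.0], [3.0, 6.0], [4.0, 4.0]])
X2 = np.array([[9.0, 10.0], [6.0, 8.0], [9.0, 5.0], [8.0, 7.0], [10.0, 8.0]])

def projected_stats(X, W):
    """Mean and (1/N) variance of the samples after projecting onto W."""
    z = X @ W
    return z.mean(), z.var()

def summarize(W):
    m1, s1 = projected_stats(X1, W)
    m2, s2 = projected_stats(X2, W)
    return {"between_gap": (m1 - m2) ** 2, "within_spread": s1 + s2}

# Two unit-length candidate axes.
W_a = np.array([1.0, 1.0]) / np.sqrt(2.0)
W_b = np.array([1.0, -1.0]) / np.sqrt(2.0)

stats_a = summarize(W_a)
stats_b = summarize(W_b)
print("W_a:", stats_a)
print("W_b:", stats_b)
```

On this data `W_a` gives both a larger gap between the projected means and a smaller within-class spread, so it is the better axis of the two under either criterion.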
Combining the two angles, the strategy (loss function) must satisfy the within-class requirement and the between-class requirement at the same time. For the setting described in the previous section, the resulting loss function $\mathcal J(\mathcal W)$ is:

$$\mathcal J(\mathcal W) = \frac{(\bar {\mathcal Z_1} - \bar {\mathcal Z_2})^2}{\mathcal S_1 + \mathcal S_2}$$
Here $\bar {\mathcal Z_j}\ (j=1,2)$ is the mean of the projected samples of class $j$:

$$\bar {\mathcal Z_j} = \frac{1}{N_j}\sum_{x^{(i)} \in \mathcal X_{C_j}}\mathcal W^{T}x^{(i)}$$

and $\mathcal S_j\ (j=1,2)$ is the variance of the projected samples of class $j$:

$$\mathcal S_j = \frac{1}{N_j}\sum_{x^{(i)} \in \mathcal X_{C_j}}(\mathcal W^{T}x^{(i)} - \bar {\mathcal Z_j})(\mathcal W^{T}x^{(i)} - \bar {\mathcal Z_j})^{T}$$

After simplification, the strategy expressed in terms of the reference axis $\mathcal W$ is:

$$\mathcal J(\mathcal W) = \frac{\mathcal W^{T}(\bar {\mathcal X_{C_1}} - \bar {\mathcal X_{C_2}})(\bar {\mathcal X_{C_1}} - \bar {\mathcal X_{C_2}})^{T}\mathcal W}{\mathcal W^{T}(\mathcal S_{C_1} + \mathcal S_{C_2})\mathcal W}$$

where $\bar {\mathcal X_{C_j}}\ (j=1,2)$ is the mean of the original samples of class $j$:

$$\bar {\mathcal X_{C_j}} = \frac{1}{N_j}\sum_{x^{(i)} \in \mathcal X_{C_j}} x^{(i)}$$

and $\mathcal S_{C_j}\ (j=1,2)$ is the variance of the original samples of class $j$ (a $p \times p$ covariance matrix):

$$\mathcal S_{C_j} = \frac{1}{N_j} \sum_{x^{(i)} \in \mathcal X_{C_j}}(x^{(i)} - \bar {\mathcal X_{C_j}})(x^{(i)} - \bar {\mathcal X_{C_j}})^{T}$$
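The equivalence of the two forms of $\mathcal J(\mathcal W)$ can be checked numerically. The sketch below evaluates the projected form and the matrix form on the same toy data and an arbitrary direction; the data, the test direction, and the helper names `class_scatter`, `J_projected`, and `J_matrix` are assumptions made for illustration.

```python
import numpy as np

# Toy two-class data (illustrative values, not from the original article).
X1 = np.array([[4.0, 2.0], [2.0, 4.0], [2.0, 3.0], [3.0, 6.0], [4.0, 4.0]])
X2 = np.array([[9.0, 10.0], [6.0, 8.0], [9.0, 5.0], [8.0, 7.0], [10.0, 8.0]])

def class_scatter(X):
    """(1/N) * sum of (x - mean)(x - mean)^T, a p x p matrix."""
    centered = X - X.mean(axis=0)
    return centered.T @ centered / len(X)

def J_projected(W):
    """J computed from the means and variances of the projected samples."""
    z1, z2 = X1 @ W, X2 @ W
    return (z1.mean() - z2.mean()) ** 2 / (z1.var() + z2.var())

def J_matrix(W):
    """J computed from the class means and covariances of the raw samples."""
    diff = X1.mean(axis=0) - X2.mean(axis=0)
    numerator = W @ np.outer(diff, diff) @ W
    denominator = W @ (class_scatter(X1) + class_scatter(X2)) @ W
    return numerator / denominator

W = np.array([0.3, -1.2])  # arbitrary test direction
j_proj = J_projected(W)
j_mat = J_matrix(W)
print(j_proj, j_mat)
```

Both functions use the $1/N_j$ normalization of the formulas above (`ndarray.var` defaults to it), which is why the two values agree up to rounding.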
Solving for the model parameters

Look again at $\mathcal J(\mathcal W)$ and define the middle factor of the numerator as the between-class variance, denoted $\mathcal S_{bet}$. Note that $\mathcal S_{bet}$ is a $p \times p$ matrix:

$$\mathcal S_{bet} = (\bar {\mathcal X_{C_1}} - \bar {\mathcal X_{C_2}})(\bar {\mathcal X_{C_1}} - \bar {\mathcal X_{C_2}})^{T}$$
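As a quick sanity check on the shape and structure of $\mathcal S_{bet}$, the sketch below builds it as an outer product for $p = 3$. The two class means and the vector `v` are hypothetical numbers chosen so that the arithmetic is exact.

```python
import numpy as np

# Hypothetical class means in p = 3 dimensions.
mean_1 = np.array([1.0, 2.0, 3.0])
mean_2 = np.array([2.0, 0.0, 5.0])

diff = mean_1 - mean_2
S_bet = np.outer(diff, diff)  # (p x 1)(1 x p) -> p x p

# An outer product of one non-zero vector with itself has rank 1.
rank = int(np.linalg.matrix_rank(S_bet))

# For any vector v, S_bet @ v is a scalar multiple of diff.
v = np.array([0.5, -1.0, 2.0])
image = S_bet @ v

print(S_bet.shape, rank, image, (diff @ v) * diff)
```

The last property, that $\mathcal S_{bet}$ maps every vector onto the line spanned by the mean difference, is what makes the final simplification of the derivation possible.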
Define the middle factor of the denominator as the within-class variance, denoted $\mathcal S_{with}$. It is also a $p \times p$ matrix:

$$\mathcal S_{with} = \mathcal S_{C_1} + \mathcal S_{C_2}$$
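A minimal sketch of computing $\mathcal S_{with}$ from data follows, assuming the $1/N_j$ normalization used in this article. The toy data and the helper name `class_scatter` are illustrative assumptions; `np.cov` with `bias=True` is used only as a cross-check.

```python
import numpy as np

# Toy two-class data (illustrative values, not from the original article).
X1 = np.array([[4.0, 2.0], [2.0, 4.0], [2.0, 3.0], [3.0, 6.0], [4.0, 4.0]])
X2 = np.array([[9.0, 10.0], [6.0, 8.0], [9.0, 5.0], [8.0, 7.0], [10.0, 8.0]])

def class_scatter(X):
    """(1/N) * sum of (x - mean)(x - mean)^T, a p x p matrix."""
    centered = X - X.mean(axis=0)
    return centered.T @ centered / len(X)

S_C1 = class_scatter(X1)
S_C2 = class_scatter(X2)
S_with = S_C1 + S_C2

# Cross-check against NumPy's covariance with the same 1/N normalization.
matches_np_cov = np.allclose(S_C1, np.cov(X1, rowvar=False, bias=True))

# S_with is symmetric; here all eigenvalues are positive, so it is invertible.
eigenvalues = np.linalg.eigvalsh(S_with)

print(S_with)
print(matches_np_cov, eigenvalues)
```

Invertibility is not guaranteed in general: with too few samples relative to $p$, $\mathcal S_{with}$ can be singular, and the later step that uses $\mathcal S_{with}^{-1}$ would need some form of regularization.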
The strategy $\mathcal J(\mathcal W)$ can then be rewritten as:

$$\begin{aligned}\mathcal J(\mathcal W) & = \frac{\mathcal W^{T} \mathcal S_{bet} \mathcal W}{\mathcal W^{T}\mathcal S_{with}\mathcal W} \\ & = \mathcal W^{T} \mathcal S_{bet} \mathcal W (\mathcal W^{T} \mathcal S_{with} \mathcal W)^{-1} \end{aligned}$$
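One consequence of this ratio form is that rescaling $\mathcal W$ does not change $\mathcal J(\mathcal W)$, since the scale factor cancels between numerator and denominator. The sketch below checks this for a few scale factors; the matrix `S_with` and the vector `diff` are illustrative numbers, not values from the original article.

```python
import numpy as np

# Illustrative within-class matrix and mean difference (assumed values).
S_with = np.array([[2.64, -0.24], [-0.24, 4.4]])
diff = np.array([-5.4, -3.8])
S_bet = np.outer(diff, diff)

def J(W):
    """Ratio of the two quadratic forms W^T S_bet W and W^T S_with W."""
    return (W @ S_bet @ W) / (W @ S_with @ W)

W = np.array([1.0, 2.0])
values = [J(c * W) for c in (1.0, -1.0, 0.5, 10.0)]
print(values)
```

All four values coincide, including the one for the flipped vector, so only the line along which $\mathcal W$ lies matters for the objective.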
Differentiate $\mathcal J(\mathcal W)$ directly. This relies on the matrix-calculus rule for differentiating a quadratic form, which holds here because $\mathcal S_{bet}$ is symmetric:

$$\frac{\partial (\mathcal W^{T}\mathcal S_{bet} \mathcal W)}{\partial \mathcal W} = 2 \mathcal S_{bet} \mathcal W$$

Applying the product rule, together with the same identity for $\mathcal S_{with}$, gives:

$$\begin{aligned}\frac{\partial \mathcal J(\mathcal W)}{\partial \mathcal W} & = 2 \mathcal S_{bet} \mathcal W (\mathcal W^{T}\mathcal S_{with} \mathcal W)^{-1} + \mathcal W^{T}\mathcal S_{bet} \mathcal W \times (-1) (\mathcal W^{T}\mathcal S_{with}\mathcal W)^{-2} \times 2 \mathcal S_{with}\mathcal W \\ & = 2 \mathcal S_{bet}\mathcal W(\mathcal W^{T} \mathcal S_{with} \mathcal W)^{-1} - 2 (\mathcal W^{T}\mathcal S_{bet} \mathcal W) (\mathcal W^{T} \mathcal S_{with} \mathcal W)^{-2} \mathcal S_{with}\mathcal W\end{aligned}$$
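A derivative like this is easy to get wrong by a constant factor, so it is worth comparing it with a finite-difference approximation. The sketch below does that for the gradient $2\mathcal S_{bet}\mathcal W(\mathcal W^{T}\mathcal S_{with}\mathcal W)^{-1} - 2(\mathcal W^{T}\mathcal S_{bet}\mathcal W)(\mathcal W^{T}\mathcal S_{with}\mathcal W)^{-2}\mathcal S_{with}\mathcal W$ and for the quadratic-form rule; the matrices, the test point, and the helper names are assumptions made for illustration.

```python
import numpy as np

# Illustrative within-class matrix and mean difference (assumed values).
S_with = np.array([[2.64, -0.24], [-0.24, 4.4]])
diff = np.array([-5.4, -3.8])
S_bet = np.outer(diff, diff)

def J(W):
    return (W @ S_bet @ W) / (W @ S_with @ W)

def grad_J(W):
    """Analytic gradient of J obtained from the product rule."""
    a = W @ S_bet @ W   # scalar numerator
    c = W @ S_with @ W  # scalar denominator
    return 2.0 * (S_bet @ W) / c - 2.0 * a / c**2 * (S_with @ W)

def numerical_grad(f, W, eps=1e-6):
    """Central finite differences, one coordinate at a time."""
    g = np.zeros_like(W)
    for k in range(len(W)):
        step = np.zeros_like(W)
        step[k] = eps
        g[k] = (f(W + step) - f(W - step)) / (2.0 * eps)
    return g

W = np.array([1.0, 2.0])
analytic = grad_J(W)
numeric = numerical_grad(J, W)

# The quadratic-form rule d(W^T S W)/dW = 2 S W for symmetric S.
quad_numeric = numerical_grad(lambda w: w @ S_bet @ w, W)
quad_analytic = 2.0 * S_bet @ W

print(analytic, numeric)
print(quad_analytic, quad_numeric)
```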
Set $\frac{\partial \mathcal J(\mathcal W)}{\partial \mathcal W} \triangleq 0$ and multiply both sides by $\frac{1}{2}(\mathcal W^{T}\mathcal S_{with} \mathcal W)^2$:

$$\mathcal S_{bet} \mathcal W (\mathcal W^{T} \mathcal S_{with} \mathcal W) - (\mathcal W^{T} \mathcal S_{bet} \mathcal W) \mathcal S_{with} \mathcal W = 0$$

$$\mathcal S_{bet}\mathcal W(\mathcal W^{T} \mathcal S_{with} \mathcal W) = (\mathcal W^{T}\mathcal S_{bet}\mathcal W)\mathcal S_{with}\mathcal W$$

Observe that $\mathcal W^{T}\mathcal S_{with}\mathcal W$ and $\mathcal W^{T}\mathcal S_{bet}\mathcal W$ both evaluate to scalars: $\mathcal W^{T}$ has dimension $1 \times p$, $\mathcal S_{with}$ and $\mathcal S_{bet}$ both have dimension $p \times p$, and $\mathcal W$ has dimension $p \times 1$. Assuming $\mathcal S_{with}$ is invertible and $\mathcal W^{T}\mathcal S_{bet}\mathcal W \neq 0$, the optimal parameter $\hat {\mathcal W}$ can therefore be written as:

$$\hat {\mathcal W} = \frac{\mathcal W^{T}\mathcal S_{with} \mathcal W}{\mathcal W^{T}\mathcal S_{bet} \mathcal W} \mathcal S_{with}^{-1} \mathcal S_{bet} \mathcal W$$
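Read as an equation in $\mathcal W$, the stationarity condition says that $\mathcal S_{with}^{-1}\mathcal S_{bet}\mathcal W$ is a scalar multiple of $\mathcal W$, that is, $\mathcal W$ is an eigenvector of $\mathcal S_{with}^{-1}\mathcal S_{bet}$ with eigenvalue $\mathcal J(\mathcal W)$. The sketch below checks this reading numerically with `np.linalg.eig`; the matrices are illustrative numbers and `stationarity_residual` is a hypothetical helper name.

```python
import numpy as np

# Illustrative within-class matrix and mean difference (assumed values).
S_with = np.array([[2.64, -0.24], [-0.24, 4.4]])
diff = np.array([-5.4, -3.8])
S_bet = np.outer(diff, diff)

def J(W):
    return (W @ S_bet @ W) / (W @ S_with @ W)

def stationarity_residual(W):
    """Left side minus right side of S_bet W (W^T S_with W) = (W^T S_bet W) S_with W."""
    return (S_bet @ W) * (W @ S_with @ W) - (W @ S_bet @ W) * (S_with @ W)

# Eigen-decomposition of S_with^{-1} S_bet (solve avoids an explicit inverse).
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_with, S_bet))
k = int(np.argmax(eigvals.real))
W_top = eigvecs[:, k].real

residual_top = stationarity_residual(W_top)
residual_other = stationarity_residual(np.array([1.0, 0.0]))

print(residual_top, residual_other)
print(eigvals.real[k], J(W_top))
```

The residual vanishes for the leading eigenvector but not for an arbitrary vector, and the leading eigenvalue equals the value of the objective at that vector.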
First consider the fraction. Its numerator and denominator are both scalars, so the fraction is a scalar as well. Since $\hat {\mathcal W}$ is a vector and we care about its direction rather than its magnitude, the scalar coefficient is usually dropped, and $\hat {\mathcal W}$ can be written as:

$$\hat {\mathcal W} \propto \mathcal S_{with}^{-1} \mathcal S_{bet} \mathcal W$$

Substituting the definition of $\mathcal S_{bet}$ (the between-class variance) and expanding:

$$\hat {\mathcal W} \propto \mathcal S_{with}^{-1}(\bar {\mathcal X_{C_1}} - \bar {\mathcal X_{C_2}})(\bar {\mathcal X_{C_1}} - \bar {\mathcal X_{C_2}})^{T}\mathcal W$$

Now look at the last two factors, $(\bar {\mathcal X_{C_1}} - \bar {\mathcal X_{C_2}})^{T}\mathcal W$. As before, $(\bar {\mathcal X_{C_1}} - \bar {\mathcal X_{C_2}})^{T}$ is a $1 \times p$ vector and $\mathcal W$ is a $p \times 1$ vector, so $(\bar {\mathcal X_{C_1}} - \bar{\mathcal X_{C_2}})^{T}\mathcal W$ is also a scalar. If a concrete interpretation is wanted, it can be read as "the gap between the class means (the between-class gap) projected onto the reference axis $\mathcal W$". This coefficient does not affect the direction of the vector either, so the expression simplifies further to:

$$\hat {\mathcal W} \propto \mathcal S_{with}^{-1}(\bar {\mathcal X_{C_1}} - \bar {\mathcal X_{C_2}})$$

A note on the word "direction": here it means the orientation of the line on which the vector lies, not the sense in which the vector points along that line. The fraction $\frac{\mathcal W^{T}\mathcal S_{with}\mathcal W}{\mathcal W^{T}\mathcal S_{bet}\mathcal W}$ is non-negative, being a ratio of quadratic forms of positive semidefinite matrices, while $(\bar {\mathcal X_{C_1}} - \bar {\mathcal X_{C_2}})^{T}\mathcal W$ may be positive or negative. Whatever the sign, multiplying a vector by such a coefficient leaves the line it lies on unchanged.

In other words, the direction of the optimal reference axis $\hat {\mathcal W}$ depends only on the direction of the vector $\mathcal S_{with}^{-1}(\bar {\mathcal X_{C_1}} - \bar {\mathcal X_{C_2}})$. The expression above is therefore the solution for the optimal reference axis $\hat {\mathcal W}$ (the optimal parameter of the linear model) in two-class linear discriminant analysis.
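The closed-form result can be tried out end to end. The sketch below computes $\mathcal S_{with}^{-1}(\bar {\mathcal X_{C_1}} - \bar {\mathcal X_{C_2}})$ on toy data and compares its objective value with that of many random directions. The data, the random-search comparison, and the helper names are assumptions made for illustration, and normalizing the result to unit length is a convenience rather than part of the derivation.

```python
import numpy as np

# Toy two-class data (illustrative values, not from the original article).
X1 = np.array([[4.0, 2.0], [2.0, 4.0], [2.0, 3.0], [3.0, 6.0], [4.0, 4.0]])
X2 = np.array([[9.0, 10.0], [6.0, 8.0], [9.0, 5.0], [8.0, 7.0], [10.0, 8.0]])

def class_scatter(X):
    """(1/N) * sum of (x - mean)(x - mean)^T, a p x p matrix."""
    centered = X - X.mean(axis=0)
    return centered.T @ centered / len(X)

mean_1, mean_2 = X1.mean(axis=0), X2.mean(axis=0)
S_with = class_scatter(X1) + class_scatter(X2)
S_bet = np.outer(mean_1 - mean_2, mean_1 - mean_2)

def J(W):
    return (W @ S_bet @ W) / (W @ S_with @ W)

# Solve S_with w = (mean_1 - mean_2) instead of forming the inverse explicitly.
W_hat = np.linalg.solve(S_with, mean_1 - mean_2)
W_hat /= np.linalg.norm(W_hat)  # only the direction matters

# No random direction should score higher than the closed-form axis.
rng = np.random.default_rng(0)
random_dirs = rng.normal(size=(1000, 2))
best_random = max(J(v) for v in random_dirs)

print("W_hat:", W_hat)
print("J(W_hat):", J(W_hat), "best random J:", best_random)
```

Using `np.linalg.solve` rather than `np.linalg.inv` is a common numerical choice for this kind of linear system; both yield the same direction when $\mathcal S_{with}$ is well conditioned.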
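Related reference: 机器学习-线性分类4-线性判别分析(模型求解) (Machine Learning, Linear Classification 4: Linear Discriminant Analysis, model solution)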

 
                 
    