Reposted from: https://blog.csdn.net/yc_1993/article/details/52997074

2016-11-01 16:19:52

Preface
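I have started quite a few topics. The handwritten drafts are all done, and I have been thinking about how to put them on CSDN. Partly because of my company's technical confidentiality and partly because the audience is different, they need heavy rewriting, so I have not posted complete versions yet. I hope to fill in all the topics I have started later on.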

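The outline first:

I. Starting from the legend of Princess Dido's land enclosure
II. The derivation of spectral clustering
  (I) Derivation
      1. Overview of spectral clustering
      2. Graph construction
      3. Cutting the graph
        (1) RatioCut
        (2) Ncut
        (3) A digression
  (II) pseudo-code
III. Implementing spectral clustering (scala)
  (I) Similarity Matrix
  (II) kNN/mutual kNN
  (III) Laplacian Matrix
  (IV) Normalized
  (V) Eigenvector (Jacobi method)
  (VI) kmeans/GMM
IV. References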

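I. Starting from the legend of Princess Dido's land enclosure

       The idea behind spectral clustering can be traced back to an ancient legend. There was once a princess whose elder brother took the throne after their father, the king, died. Wanting all the power for himself, he murdered her husband. Fleeing for her life, the princess came to a tribe and wanted to buy a piece of land from the local chief. She traded her gold and jewels to the chief for an ox hide, and the two agreed that she would get only as much land as the hide could cover. The shrewd chief thought this was a good deal and agreed. Little did he know that the princess would cut the hide into thin strips and, laying them along the coastline, enclose enough land for an entire city.
       The story ends there, but ours is only beginning. The legend of Princess Dido is the earliest known reference to the isoperimetric problem: how to enclose the largest possible area with a line of given length, or equivalently, how to enclose a given area with the shortest possible line. This is exactly where the idea of spectral graph clustering starts: given a graph, how to remove "shorter" edges so as to split it "better". The "shorter" edges correspond to the minimization problem in spectral clustering, and the "better" split corresponds to the quality of the resulting clusters.
       Spectral clustering dates back to 1973, when Donath and Hoffman first proposed using eigenvectors to choose the vector f in the graph partitioning problem. In the same year, Fiedler found that the eigenvector belonging to the second-smallest eigenvalue is a much better fit for f. Of the two, Fiedler's work gained wider acceptance, because it gives a tractable way around the NP-hard minimization problem behind spectral clustering. That is an immeasurable achievement, although later research found that the error this approximation introduces can be just as immeasurable. (Photo: Fiedler, who passed away in 2015 and is fondly remembered.)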

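II. The derivation of spectral clustering

(I) Derivation

1. Overview of spectral clustering

Spectral clustering grew out of graph theory and later became widely used for clustering because of its strong performance. Compared with other unsupervised clustering methods (such as kmeans), its main advantages are:

1. It makes few assumptions about the structure of the data, whereas kmeans, for example, only works well when the clusters are convex.
2. By building a sparse similarity graph, it can be noticeably faster than other algorithms on larger datasets.
3. Because spectral clustering works by cutting a graph, it does not merge scattered small clusters together the way kmeans can.
4. Unlike GMM, it needs no assumption about the probability distribution of the data.

Spectral clustering also has drawbacks, mainly in the graph construction step:

1. It is sensitive to the choice of similarity graph (epsilon-neighborhood, k-nearest neighborhood, fully connected, and so on).
2. It is also sensitive to the choice of parameters (the epsilon of the epsilon-neighborhood graph, the k of the k-nearest neighborhood graph, the sigma of the fully connected graph).

       Spectral clustering has two main steps. The first is graph construction: the sample points are turned into a graph G(V,E), where V is the set of vertices and E the set of edges between them, as shown below:

                            Figure 1: Graph construction in spectral clustering (source: Wikipedia)
       The second step is cutting the graph: the graph built in the first step is split into subgraphs according to some cut criterion, and the resulting subgraphs are the clusters. For example:
                            Figure 2: Cutting the graph in spectral clustering
       At first glance this does not look hard, but... The derivation is explained in detail below.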

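2. Graph construction

Graph construction usually takes one of three forms:
1. $\varepsilon$-neighborhood
2. k-nearest neighborhood
3. fully connected

In the $\varepsilon$-neighborhood graph, with $s_{i,j}$ the distance between $x_i$ and $x_j$:

$W_{i,j} = \begin{cases} 0, & s_{i,j} > \varepsilon \\ \varepsilon, & s_{i,j} \le \varepsilon \end{cases}$

It can be seen that in the $\varepsilon$-neighborhood graph every retained edge carries the same weight, so the weights hold no information beyond whether two points are connected.

In the k-nearest neighborhood graph, two points are connected if either one is among the k nearest neighbors of the other, and the edge is weighted with a Gaussian of the distance:

$W_{i,j} = W_{j,i} = \begin{cases} 0, & x_i \notin KNN(x_j) \text{ and } x_j \notin KNN(x_i) \\ \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right), & \text{otherwise} \end{cases}$

The mutual kNN variant keeps an edge only if each point is among the k nearest neighbors of the other:

$W_{i,j} = W_{j,i} = \begin{cases} 0, & x_i \notin KNN(x_j) \text{ or } x_j \notin KNN(x_i) \\ \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right), & x_i \in KNN(x_j) \text{ and } x_j \in KNN(x_i) \end{cases}$

In the fully connected graph, every pair of points is connected with the same Gaussian weight.

Whichever construction is used, the degree matrix D is the diagonal matrix of the row sums of W:

$D_{i,j} = \begin{cases} 0, & i \ne j \\ \sum_{j} w_{i,j}, & i = j \end{cases}$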
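The epsilon-neighborhood and fully connected constructions named in this section are not part of the implementation in section III, so here is a minimal sketch of both. The object name `GraphConstructionSketch`, the function names, the choice of a precomputed distance matrix `dist` as input, and the decision to leave out self-loops are all assumptions made for illustration.

```scala
object GraphConstructionSketch {
  type Matrix = Array[Array[Double]]

  // Epsilon-neighborhood graph: constant weight eps for pairs at distance <= eps.
  // Assumption of this sketch: no self-loops, so the diagonal stays 0.
  def epsNeighborhood(dist: Matrix, eps: Double): Matrix =
    Array.tabulate(dist.length, dist.length) { (i, j) =>
      if (i != j && dist(i)(j) <= eps) eps else 0.0
    }

  // Fully connected graph: Gaussian weight exp(-d^2 / (2 sigma^2)) for every pair.
  def fullyConnected(dist: Matrix, sigma: Double): Matrix =
    Array.tabulate(dist.length, dist.length) { (i, j) =>
      if (i == j) 0.0 else math.exp(-dist(i)(j) * dist(i)(j) / (2 * sigma * sigma))
    }

  // Degree of each vertex: the row sums of W, i.e. the diagonal of D.
  def degrees(w: Matrix): Array[Double] = w.map(_.sum)
}
```

With three points at pairwise distances 1, 4 and 5 and eps = 1.5, only the pair at distance 1 stays connected, and the third point ends up with degree 0. This is the kind of sensitivity to the parameter choice mentioned in the overview.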

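3. Cutting the graph

There are two mainstream ways to cut the graph, RatioCut and Ncut. The goal is to find a set of edges whose total weight is minimal and whose removal also yields subgraphs of balanced size. Both are explained in detail below.
Before RatioCut and Ncut, some background and notation are needed. Let V be the set of all sample points and ${A_{1}, A_{2}, \dots, A_{k}}$ a partition of V into k disjoint subsets, with $\bar{A_i}$ the complement of $A_i$ in V. The total weight of the edges removed by the partition is:

$cut(A_1, A_2, \dots, A_k) = \frac{1}{2} \sum_{i=1}^{k} W(A_i, \bar{A_i})$

where

$W(A_i, \bar{A_i}) = \sum_{m \in A_i, n \in \bar{A_i}} w_{m,n}$

The most direct objective is:

$\min cut(A_1, A_2, \dots, A_k)$

Minimizing the cut alone tends to split off single points, so the two criteria below divide each term by a measure of the size of the subgraph:

$RatioCut(A_1, A_2, \dots, A_k) = \frac{1}{2} \sum_{i=1}^{k} \frac{W(A_i, \bar{A_i})}{|A_i|}$

$Ncut(A_1, A_2, \dots, A_k) = \frac{1}{2} \sum_{i=1}^{k} \frac{W(A_i, \bar{A_i})}{vol(A_i)}$

Here $|A_i|$ is the number of points in $A_i$ and $vol(A_i) = \sum_{m \in A_i} d_m$ is the sum of the degrees of its points.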
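(1) RatioCut

RatioCut takes the size of each target subgraph into account, which avoids a single sample point ending up as its own cluster and balances the sizes of the subgraphs. Its objective is still to minimize the weight of the edges between subgraphs:

$\min RatioCut(A_1, A_2, \dots, A_k)$

Define k indicator vectors $h_i = (h_{1,i}, \dots, h_{n,i})^T$, one per subset:

$h_{j,i} = \begin{cases} \frac{1}{\sqrt{|A_i|}}, & v_j \in A_i \\ 0, & v_j \notin A_i \end{cases}$

For $h_i^T L h_i$, with $L = D - W$ and $d_m = \sum_{n} w_{m,n}$:

$h_i^T L h_i = \frac{1}{2} \left( \sum_{m} d_m h_{m,i}^2 - 2 \sum_{m,n} w_{m,n} h_{m,i} h_{n,i} + \sum_{n} d_n h_{n,i}^2 \right) = \frac{1}{2} \sum_{m,n} w_{m,n} (h_{m,i} - h_{n,i})^2$

Only pairs with exactly one endpoint in $A_i$ contribute, each with $1 / |A_i|$, and every such pair is counted twice, so:

$h_i^T L h_i = \frac{W(A_i, \bar{A_i})}{|A_i|}$

Collecting the vectors into $H = (h_1, \dots, h_k)$ gives:

$RatioCut(A_1, A_2, \dots, A_k) = \frac{1}{2} \sum_{i=1}^{k} h_i^T L h_i = \frac{1}{2} Tr(H^T L H)$

The $h_i$ are orthonormal by construction, so $H^T H = I$. Minimizing over the discrete indicator vectors is NP-hard, so H is relaxed to take arbitrary real values (the constant factor does not change the minimizer):

$\arg\min_{H} Tr(H^T L H), \quad s.t.\ H^T H = I$

The solution is the matrix whose columns are the eigenvectors of L belonging to its k smallest eigenvalues. Finally, each row of H is normalized:

$H^{*}_{i,j} = \frac{H_{i,j}}{\left( \sum_{l=1}^{k} H_{i,l}^2 \right)^{1/2}}$

and the rows are clustered.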
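The key identity in the RatioCut derivation, that the quadratic form of the Laplacian with a cluster indicator vector equals the cut weight divided by the cluster size, can be checked numerically. The sketch below is an illustration under assumptions: the object name `RatioCutIdentity` and its function names are made up here, and the quadratic form is expanded with L = D - W rather than by building L explicitly.

```scala
object RatioCutIdentity {
  type Matrix = Array[Array[Double]]

  // Indicator vector of cluster a: 1/sqrt(|a|) on members, 0 elsewhere.
  def indicator(n: Int, a: Set[Int]): Array[Double] =
    Array.tabulate(n)(j => if (a.contains(j)) 1.0 / math.sqrt(a.size.toDouble) else 0.0)

  // h^T L h written out with L = D - W: sum_m d_m h_m^2 - sum_{m,n} w_mn h_m h_n.
  def viaLaplacian(w: Matrix, h: Array[Double]): Double = {
    val n = h.length
    val degreeTerm = (0 until n).map(m => w(m).sum * h(m) * h(m)).sum
    val weightTerm = (for (m <- 0 until n; k <- 0 until n) yield w(m)(k) * h(m) * h(k)).sum
    degreeTerm - weightTerm
  }

  // The pairwise form 1/2 * sum_{m,n} w_mn (h_m - h_n)^2.
  def viaPairs(w: Matrix, h: Array[Double]): Double = {
    val n = h.length
    0.5 * (for (m <- 0 until n; k <- 0 until n)
      yield w(m)(k) * (h(m) - h(k)) * (h(m) - h(k))).sum
  }
}
```

On the four-vertex path graph with weights 1, 0.5, 1 and the cluster {0, 1}, both forms evaluate to 0.25, which is the weight 0.5 of the single cut edge divided by the cluster size 2.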

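(2) Ncut

Ncut is very similar to RatioCut, but it replaces RatioCut's denominator $| A_{i} |$ with $vol(A_i)$, so that each subgraph is measured by its total edge weight rather than by its number of points. The objective is:

$\min Ncut(A_1, A_2, \dots, A_k)$

The indicator vectors change accordingly:

$h_{j,i} = \begin{cases} \frac{1}{\sqrt{vol(A_i)}}, & v_j \in A_i \\ 0, & v_j \notin A_i \end{cases}$

The same calculation as for RatioCut gives:

$h_i^T L h_i = \frac{1}{2} \sum_{m,n} w_{m,n} (h_{m,i} - h_{n,i})^2 = \frac{W(A_i, \bar{A_i})}{vol(A_i)}$

but now the vectors are normalized with respect to D:

$h_i^T D h_i = \sum_{m \in A_i} \frac{d_m}{vol(A_i)} = 1$

so that:

$Ncut(A_1, A_2, \dots, A_k) = \frac{1}{2} \sum_{i=1}^{k} h_i^T L h_i = \frac{1}{2} Tr(H^T L H)$

and the relaxed problem is:

$\arg\min_{H} Tr(H^T L H), \quad s.t.\ H^T D H = I$

Substituting $H = D^{-1/2} F$:

$H^T L H = (D^{-1/2} F)^T L D^{-1/2} F = F^T D^{-1/2} L D^{-1/2} F$

$H^T D H = (D^{-1/2} F)^T D D^{-1/2} F = F^T F = I$

The problem becomes:

$\arg\min_{F} Tr(F^T D^{-1/2} L D^{-1/2} F), \quad s.t.\ F^T F = I$

whose solution F consists of the eigenvectors of $D^{-1/2} L D^{-1/2}$ belonging to its k smallest eigenvalues. As before, the rows are normalized before clustering:

$F^{*}_{i,j} = \frac{F_{i,j}}{\left( \sum_{l=1}^{k} F_{i,l}^2 \right)^{1/2}}$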
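The final row-normalization step of the Ncut derivation is simple enough to sketch directly. The object name `RowNormalize`, the function name `normalizeRows`, and the handling of all-zero rows (returned unchanged) are assumptions of this illustration.

```scala
object RowNormalize {
  // Scale every row of f to unit Euclidean length.
  // Assumption of this sketch: an all-zero row is returned unchanged.
  def normalizeRows(f: Array[Array[Double]]): Array[Array[Double]] =
    f.map { row =>
      val norm = math.sqrt(row.map(x => x * x).sum)
      if (norm == 0.0) row.clone() else row.map(_ / norm)
    }
}
```

A row (3, 4) becomes (0.6, 0.8). After this step each point lies on the unit sphere in the k-dimensional embedding, so the clustering step compares directions rather than magnitudes.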

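(3) A digression

       If you only want to apply spectral clustering, you can skip this part and go straight to the pseudo-code below. If you like to dig deeper, read on.
       It is worth mentioning that a probabilistic point of view leads to the same place as the derivation above, and its conclusion matches Ncut. Most papers call this view random walks; in stochastic mathematics it is commonly seen in Markov models. For details, see Lovász (1993: Random Walks on Graphs: A Survey) and Meila & Shi (2001: A Random Walks View of Spectral Segmentation).
       In the random walk framework one usually builds a transition probability matrix (transition matrix). Using the adjacency matrix from above, the transition matrix P is:

$P_{i,j} = \frac{w_{i,j}}{D_{i,i}}$

that is, $P = D^{-1} W$. Its stationary distribution is:

$\pi_i = \frac{D_{i,i}}{vol(V)}$

For a partition of V into two sets $A_1$ and $A_2$, the objective is to make the walk unlikely to jump from one set to the other:

$\min \frac{1}{2} \left( P(A_1 | A_2) + P(A_2 | A_1) \right)$

where $P(A_1 | A_2)$ is the probability that a walk started from the stationary distribution and currently in $A_2$ moves to $A_1$ in the next step:

$P(A_1 | A_2) = \frac{P(A_1, A_2)}{P(A_2)}$

with

$P(A_2) = \frac{vol(A_2)}{vol(V)}$

$P(A_1, A_2) = \sum_{i \in A_2, j \in A_1} \pi_i P_{i,j} = \frac{1}{vol(V)} \sum_{i \in A_2, j \in A_1} w_{i,j}$

Therefore:

$P(A_1 | A_2) = \frac{1}{vol(V)} \sum_{i \in A_2, j \in A_1} w_{i,j} \cdot \frac{vol(V)}{vol(A_2)} = \frac{W(A_1, A_2)}{vol(A_2)}$

and, since $\bar{A_1} = A_2$ and W is symmetric:

$\min \frac{1}{2} \left( P(A_1 | A_2) + P(A_2 | A_1) \right) = \min \frac{1}{2} \left( \frac{W(A_2, \bar{A_2})}{vol(A_2)} + \frac{W(A_1, \bar{A_1})}{vol(A_1)} \right) = \min Ncut(A_1, A_2)$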
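The random walk view can also be checked on a small graph. The sketch below builds the transition matrix and the stationary distribution from an adjacency matrix and computes the one-step conditional probability of moving between two vertex sets. The object name `RandomWalkSketch` and its function names are assumptions for illustration, and the sketch assumes every vertex has positive degree.

```scala
object RandomWalkSketch {
  type Matrix = Array[Array[Double]]

  def degrees(adj: Matrix): Array[Double] = adj.map(_.sum)

  // Transition matrix P = D^{-1} W (assumes every degree is positive).
  def transition(adj: Matrix): Matrix = {
    val d = degrees(adj)
    Array.tabulate(adj.length, adj.length)((i, j) => adj(i)(j) / d(i))
  }

  // Stationary distribution pi_i = d_i / vol(V).
  def stationary(adj: Matrix): Array[Double] = {
    val d = degrees(adj)
    val total = d.sum
    d.map(_ / total)
  }

  // P(A1 | A2): probability that a walk sitting in a2 (under pi) steps into a1.
  def condProb(adj: Matrix, a1: Set[Int], a2: Set[Int]): Double = {
    val p = transition(adj)
    val pi = stationary(adj)
    val joint = (for (i <- a2.toSeq; j <- a1.toSeq) yield pi(i) * p(i)(j)).sum
    joint / a2.toSeq.map(i => pi(i)).sum
  }
}
```

On the four-vertex path graph with weights 1, 0.5, 1 and the sets {0, 1} and {2, 3}, `condProb` returns 0.2 in both directions, which equals the cut weight 0.5 divided by the volume 2.5 of each side.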

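(II) pseudo-code

       Spectral clustering has many pseudo-code variants. Only the most common one is covered here: the normalized version, which uses Ncut as the cut criterion.
       Specifically:
       1. Build the matrix S (similarity matrix) and fix the number of clusters k;
       2. Build the matrix W (adjacency matrix) from S;
       3. Compute the Laplacian matrix L, where L = D - W;
       4. Normalize L, that is, let $L' = D^{-1/2} L D^{-1/2}$;
       5. Compute the eigenvectors of the normalized L that belong to its k smallest eigenvalues, ordered by ascending eigenvalue;
       6. Cluster the rows of the matrix formed by these k vectors with kmeans/GMM;
       7. Label each resulting cluster, map the labels back to the original data, and output the result.

input: (data,K,k,kNNType,sigma,epsilon)
1. S = Euclidean(data)
2. W = Gaussian( kNN(S,k,kNNType) , sigma )
3. L = D - W
4. L' = normalized(L)
5. EV = eigenvector(L',K)
6. while( |newCenter - oldCenter| > epsilon ){
      oldCenter = newCenter
      newCenter = kmeans(EV,K)
   }
output: K clusters of SC

Here data is the sample set, a two-dimensional array in the code; K is the number of clusters; k is the number of neighbors used by kNN; kNNType is the kind of kNN function used in SC, usually kNN or mutualkNN (see the explanation above); sigma is the parameter of the Gaussian function used to build W; and epsilon is the convergence threshold of kmeans or GMM, meaning that iteration stops once the centers move by less than epsilon. SC outputs K clusters.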

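III. Implementing spectral clustering

The implementation involves the following pieces: Similarity Matrix, kNN/mutual kNN, Laplacian Matrix, Normalized, Eigenvector (Jacobi method), and kmeans/GMM. They are walked through in order below, in scala. One caveat: for reasons of confidentiality, only simple versions can be shown here. If you try them yourself, you will find that they cope with small samples but become painfully slow as the sample grows; for that you will have to dig through the papers. If the chance arises, I will write a separate post on optimization.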

(I) Similarity Matrix

    // calculate the similarity matrix
    def calculateSimilarityMatrix(SCInput: Array[Array[Double]]): Array[Array[Double]] = {
        // squared Euclidean distance from one sample to every sample (no square root is taken)
        def SCEuclideanDistance(SCEDInput: Array[Double]): Array[Double] = {
            SCInput.map(_.zip(SCEDInput))
                   .map(a => a.map(b => math.pow(b._1 - b._2, 2)).sum)
        }
        SCInput.map(SCEuclideanDistance)
    }
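As a quick illustration of what the similarity matrix holds, the following self-contained sketch computes the same pairwise squared distances for three sample points. The object name `SimilarityUsage`, the function name `squaredDistances` and the sample points are assumptions chosen for this example.

```scala
object SimilarityUsage {
  // Squared Euclidean distance between every pair of rows (no square root is taken).
  def squaredDistances(data: Array[Array[Double]]): Array[Array[Double]] =
    data.map(a => data.map(b => a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum))

  def main(args: Array[String]): Unit = {
    val data = Array(Array(0.0, 0.0), Array(3.0, 4.0), Array(0.0, 1.0))
    // Prints one row of the matrix per line.
    squaredDistances(data).foreach(row => println(row.mkString(" ")))
  }
}
```

For these points the matrix is symmetric with a zero diagonal, and the entry for the first two points is 25.0 rather than 5.0, because the distances are squared. That is why the Gaussian weight in the next step divides the entry by 2 sigma^2 without squaring it again.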

       No data is plugged in here; instead a similarity matrix is constructed directly, as follows:

                     Figure 4: similarity Matrix

(II) kNN/mutual kNN

The next step adjusts the similarity graph built from the similarity matrix above and turns it into W (the adjacency matrix).

    // define the kNN function
    import scala.annotation.tailrec

    def kNN(k: Int, kNNType: String, sigma: Double = 1.0, SMatrix: Array[Array[Double]]): Array[Array[Double]] = {
        val len = SMatrix.length
        val AdjacencyMatrix = Array.ofDim[Double](len, len)

        // define the function of calculating the adjacency matrix (for mutualkNN)
        @tailrec
        def calculateMutualAdjacencyMatrix(n: Int, S: Array[Array[Double]]): Array[Array[Double]] = {
            if (n < len) {
                // take the k + 1 smallest values of row n (the point itself, at distance 0, is normally one of them)
                val kSmallestValue = S(n).zipWithIndex.sortWith((a, b) => a._1 < b._1).take(k + 1)
                val indexOfValue = kSmallestValue.map(_._2).distinct
                // calculate the Gaussian similarity value
                for (i <- indexOfValue) {
                    val GaussianSimilarity = math.exp(-S(n)(i) / 2 / sigma / sigma)
                    AdjacencyMatrix(n)(i) = GaussianSimilarity
                }
                calculateMutualAdjacencyMatrix(n + 1, S)
            } else {
                // judge mutual or not: keep an edge only if it exists in both directions.
                for (i <- 0 until len) {
                    val notZero = AdjacencyMatrix(i).zipWithIndex.filter(_._1 != 0.0).map(_._2)
                    for (j <- notZero) if (AdjacencyMatrix(j)(i) == 0.0) AdjacencyMatrix(i)(j) = 0.0
                }
                AdjacencyMatrix
            }
        }

        // define the function of calculating the adjacency matrix (for kNN)
        @tailrec
        def calculateAdjacencyMatrix(n: Int, S: Array[Array[Double]]): Array[Array[Double]] = {
            if (n < len) {
                // take the k + 1 smallest values of row n (the point itself, at distance 0, is normally one of them)
                val kSmallestValue = S(n).zipWithIndex.sortWith((a, b) => a._1 < b._1).take(k + 1)
                val indexOfValue = kSmallestValue.map(_._2).distinct
                // calculate the Gaussian similarity value and set it in both directions
                for (i <- indexOfValue) {
                    val GaussianSimilarity = math.exp(-S(n)(i) / 2 / sigma / sigma)
                    AdjacencyMatrix(n)(i) = GaussianSimilarity
                    AdjacencyMatrix(i)(n) = GaussianSimilarity
                }
                calculateAdjacencyMatrix(n + 1, S)
            } else {
                AdjacencyMatrix
            }
        }

        if (kNNType == "mutualkNN") {
            calculateMutualAdjacencyMatrix(0, SMatrix)
        } else {
            calculateAdjacencyMatrix(0, SMatrix)
        }
    }

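       The results of the two variants, kNN and mutualkNN, are shown below. PS: these are screenshots taken straight from IntelliJ, so they may not look that great (ㄒoㄒ)
                     Figure 5: adjacency Matrix (kNN)

Figure 6: adjacency Matrix (mutualkNN)

Both matrices were computed with k = 2 and sigma = 1 from the matrix above, $[\begin{matrix} 0.0 & 8.0 & 7.0 & 5.0 \\ 8.0 & 0.0 & 6.0 & 10.0 \\ 7.0 & 6.0 & 0.0 & 3.0 \\ 5.0 & 10.0 & 3.0 & 0.0 \end{matrix}]$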

(III) Laplacian Matrix

Next, the W (adjacency matrix) from above is used to compute D (degree matrix) and L (laplacian matrix).

    // define the function of calculating the Laplacian Matrix
    def calculateLaplacianMatrix(adjacencyMatrix: Array[Array[Double]]): Array[Array[Double]] = {
        val len = adjacencyMatrix.length
        val laplacianMatrix = Array.ofDim[Double](len, len)

        // define the function of calculating the degree matrix
        def calculateDegreeMatrix(AM: Array[Array[Double]]): Array[Array[Double]] = {
            val degreeMatrix = Array.ofDim[Double](len, len)
            for (i <- 0 until len) {
                degreeMatrix(i)(i) = AM(i).sum
            }
            degreeMatrix
        }
        val degreeMatrix = calculateDegreeMatrix(adjacencyMatrix)

        // calculate the Laplacian matrix
        for (i <- 0 until len) {
            laplacianMatrix(i) = degreeMatrix(i).zip(adjacencyMatrix(i)).map(a => a._1 - a._2)
        }
        laplacianMatrix
    }

       Here the computation of D and the computation of L are written together and done in one go. The results are as follows:
                     Figure 7: degree Matrix

Figure 8: laplacian Matrix
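Two properties make a handy sanity check for any Laplacian built as L = D - W from a symmetric W: L is symmetric, and every row sums to zero. The sketch below is a self-contained illustration; the object name `LaplacianCheck`, its function names and the tolerance parameter are assumptions of this example.

```scala
object LaplacianCheck {
  type Matrix = Array[Array[Double]]

  // L = D - W, where D is the diagonal matrix of the row sums of W.
  def laplacian(w: Matrix): Matrix =
    Array.tabulate(w.length, w.length) { (i, j) =>
      (if (i == j) w(i).sum else 0.0) - w(i)(j)
    }

  // True when m(i)(j) and m(j)(i) agree within tol for all i, j.
  def isSymmetric(m: Matrix, tol: Double = 1e-12): Boolean =
    m.indices.forall(i => m.indices.forall(j => math.abs(m(i)(j) - m(j)(i)) <= tol))

  // True when every row of m sums to zero within tol.
  def rowsSumToZero(m: Matrix, tol: Double = 1e-12): Boolean =
    m.forall(row => math.abs(row.sum) <= tol)
}
```

The zero row sums mean that the constant vector is always an eigenvector of L with eigenvalue 0, which is why the smallest eigenvalue alone carries no clustering information.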

(IV) Normalized

Next, the L (laplacian matrix) from above is normalized.

    // define the function of calculating the normalized Laplacian Matrix
    def Normalized(laplacianMatrix: Array[Array[Double]], adjacencyMatrix: Array[Array[Double]]): Array[Array[Double]] = {
        val len = adjacencyMatrix.length

        // define the function of calculate the -1/2 power of the degree matrix
        def calculateAdjustDegreeMatrix(AM: Array[Array[Double]]): Array[Array[Double]] = {
            val adjustDegreeMatrix = Array.ofDim[Double](len, len)
            for (i <- 0 until len) {
                adjustDegreeMatrix(i)(i) = 1 / math.pow(AM(i).sum, 0.5)
            }
            adjustDegreeMatrix
        }
        val adjustDegreeMatrix = calculateAdjustDegreeMatrix(adjacencyMatrix)

        // calculate the normalized laplacian matrix.
        // matrixProduct computes the upper triangle and mirrors it. The mirrored half of the
        // intermediate product is not exact, but it is never read in the second product because
        // adjustDegreeMatrix is diagonal; the final matrix is symmetric, so mirroring it is valid.
        def matrixProduct(left: Array[Array[Double]], right: Array[Array[Double]]): Array[Array[Double]] = {
            val len = left.length
            val output = Array.ofDim[Double](len, len)
            for (i <- 0 until len; j <- i until len) {
                output(i)(j) = left(i).zip(right.map(_(j))).map(a => a._1 * a._2).sum
                output(j)(i) = output(i)(j)
            }
            output
        }
        val temp = matrixProduct(adjustDegreeMatrix, laplacianMatrix)
        val normalizedLaplacianMatrix = matrixProduct(temp, adjustDegreeMatrix)
        normalizedLaplacianMatrix
    }

       Below is the normalized result of the L (laplacian matrix) above:

                     Figure 9: Normalized laplacian Matrix
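Because the degree matrix is diagonal, the product D^{-1/2} L D^{-1/2} can also be written entry by entry, which avoids the two matrix multiplications. The following is a minimal sketch of that alternative, not the approach used in the code above; the object name `NormalizedLaplacianSketch` and the function name are assumptions, and every degree is assumed to be strictly positive.

```scala
object NormalizedLaplacianSketch {
  type Matrix = Array[Array[Double]]

  // L'_{ij} = L_{ij} / sqrt(d_i * d_j), the entrywise form of D^{-1/2} L D^{-1/2}.
  // Assumes every degree d_i is strictly positive.
  def normalizedLaplacian(w: Matrix): Matrix = {
    val d = w.map(_.sum)
    Array.tabulate(w.length, w.length) { (i, j) =>
      val lij = (if (i == j) d(i) else 0.0) - w(i)(j)
      lij / math.sqrt(d(i) * d(j))
    }
  }
}
```

This takes O(n^2) operations instead of the O(n^3) of a general matrix product. When W has no self-loops, every diagonal entry of the result is exactly 1.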

(V) Eigenvector (Jacobi method)

Next, take the eigenvectors of L', the normalized L from above, and assemble them into a new matrix. The eigenvectors are computed here with the serial Jacobi rotation method. The idea is to apply a sequence of row and column rotations to L' until the resulting similar matrix is diagonal; its diagonal then holds the eigenvalues and the accumulated rotations hold the eigenvectors. It is worth noting that this method works for small samples but is very inefficient for large ones. Whether I write up an optimized method later depends on how things go; we will see...

    // define the function of calculating the k smallest eigenvectors of normalized laplacian matrix with Jacobi method.
    // note: normalizedLaplacian is overwritten in place by the rotations.
    def kSmallestEigenvectors(k: Int, normalizedLaplacian: Array[Array[Double]]): (Array[Double], Array[Array[Double]]) = {

        val len = normalizedLaplacian.length

        // initial the eigenvector matrix.
        val eigenvectorMatrix = Array.ofDim[Double](len, len)
        for (i <- 0 until len) eigenvectorMatrix(i)(i) = 1.0

        // initial the parameter of epsilon: returned as a sentinel (index -1) once
        // every off-diagonal entry is smaller than 1e-10 in absolute value.
        val epsilon = (math.pow(10, -10), -1)

        // calculate the largest one Of normalized laplacian matrix(off-diagonal).
        // returns (value, column index) of the largest entry of the upper triangle.
        def calculateLargestOfNormL(Input: Array[Array[Double]]): (Double, Int) = {
            val temp = Input.map(_.zipWithIndex)
            val largestOfRow = new Array[(Double, Int)](len - 1)
            for (i <- 0 until (len - 1)) {
                largestOfRow(i) = temp(i).filter(_._2 > i).sortWith((a, b) => a._1.abs > b._1.abs).head
            }
            val largestOfInput = largestOfRow.sortWith((a, b) => a._1.abs > b._1.abs).head
            if (epsilon._1 > largestOfInput._1.abs) epsilon else largestOfInput
        }
        var largestOfNormL = calculateLargestOfNormL(normalizedLaplacian)

        // Judge condition: skip the loop entirely if the matrix is already diagonal.
        var loop: Boolean = largestOfNormL._2 != -1

        // main loop.
        while (loop) {
            // the index of the largest value of normalized laplacian matrix.
            // _2 is the column of that entry; since the matrix is symmetric, looking the
            // value up in that row recovers the partner index of the pivot.
            val mRow = largestOfNormL._2
            val mCol = normalizedLaplacian(largestOfNormL._2).indexOf(largestOfNormL._1)

            // calculate the new normalized Laplacian matrix: angle, sin, cos, new(row,col)
            // cache the temp value.
            val nL_ij = normalizedLaplacian(mRow)(mCol)
            val nL_ii = normalizedLaplacian(mRow)(mRow)
            val nL_jj = normalizedLaplacian(mCol)(mCol)
            val angle = if (nL_jj == nL_ii) math.Pi / 4.0 else 0.5 * math.atan2(2 * nL_ij, nL_jj - nL_ii)
            val sinAngle = math.sin(angle)
            val cosAngle = math.cos(angle)

            // update the normalized Laplacian matrix (ii, jj, ij, ji)
            normalizedLaplacian(mRow)(mRow) = nL_ii * cosAngle * cosAngle + nL_jj * sinAngle * sinAngle - 2 * nL_ij * cosAngle * sinAngle
            normalizedLaplacian(mCol)(mCol) = nL_ii * sinAngle * sinAngle + nL_jj * cosAngle * cosAngle + 2 * nL_ij * cosAngle * sinAngle
            normalizedLaplacian(mRow)(mCol) = (cosAngle * cosAngle - sinAngle * sinAngle) * nL_ij + sinAngle * cosAngle * (nL_ii - nL_jj)
            normalizedLaplacian(mCol)(mRow) = normalizedLaplacian(mRow)(mCol)
            for (i <- (0 until len).filter(a => a != mRow && a != mCol)) {
                val tempRi = normalizedLaplacian(mRow)(i)
                val tempCi = normalizedLaplacian(mCol)(i)
                normalizedLaplacian(mRow)(i) = cosAngle * tempRi - sinAngle * tempCi
                normalizedLaplacian(mCol)(i) = sinAngle * tempRi + cosAngle * tempCi
                normalizedLaplacian(i)(mRow) = normalizedLaplacian(mRow)(i)
                normalizedLaplacian(i)(mCol) = normalizedLaplacian(mCol)(i)
            }

            // update the eigenvector matrix.
            for (i <- 0 until len) {
                val eigenIR = eigenvectorMatrix(i)(mRow)
                val eigenIC = eigenvectorMatrix(i)(mCol)
                eigenvectorMatrix(i)(mRow) = eigenIR * cosAngle - eigenIC * sinAngle
                eigenvectorMatrix(i)(mCol) = eigenIC * cosAngle + eigenIR * sinAngle
            }

            // update the temp value again.
            largestOfNormL = calculateLargestOfNormL(normalizedLaplacian)

            // update the judge condition.
            if (largestOfNormL._2 == -1) loop = false
        }

        // return the k smallest eigenvalue and it's corresponding eigenvector matrix.
        val eigenvalue = new Array[Double](len)
        for (i <- 0 until len) eigenvalue(i) = normalizedLaplacian(i)(i)

        def kSmallest(eigenvalue: Array[Double], eigenvector: Array[Array[Double]]): (Array[Double], Array[Array[Double]]) = {
            val eigenvalueWithIndex = eigenvalue.zipWithIndex.sortWith((a, b) => a._1 < b._1).take(k)
            val output = Array.ofDim[Double](len, k)
            var n: Int = 0
            for ((v, i) <- eigenvalueWithIndex) {
                for (j <- 0 until len) {
                    output(j)(n) = eigenvector(j)(i)
                }
                n += 1
            }
            (eigenvalueWithIndex.map(_._1), output)
        }
        val kSmallestValueVector = kSmallest(eigenvalue, eigenvectorMatrix)

        // final output.
        (kSmallestValueVector._1, kSmallestValueVector._2)
    }

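To make this more concrete, take the matrix $[\begin{matrix} 0.0 & 8.0 & 7.0 & 5.0 \\ 8.0 & 0.0 & 6.0 & 10.0 \\ 7.0 & 6.0 & 0.0 & 3.0 \\ 5.0 & 10.0 & 3.0 & 0.0 \end{matrix}]$ as an example:

Figure 11: eigenvectors

The result can be compared with Matlab, R, or python. I only checked it against R, and there was essentially no difference.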
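Besides comparing against another tool, an eigenpair can be checked directly by measuring how far A v is from lambda v. The sketch below is one hedged way to do that; the object name `EigenCheck` and the function name `residual` are assumptions for illustration, and the residual is measured in the maximum norm.

```scala
object EigenCheck {
  type Matrix = Array[Array[Double]]

  // Largest absolute entry of A v - lambda v; close to 0 for a true eigenpair.
  def residual(a: Matrix, lambda: Double, v: Array[Double]): Double = {
    val av = a.map(row => row.zip(v).map { case (x, y) => x * y }.sum)
    av.zip(v).map { case (x, vi) => math.abs(x - lambda * vi) }.max
  }
}
```

Since the Jacobi routine above overwrites its input while rotating, a check like this needs a copy of the matrix taken beforehand, for example with `matrix.map(_.clone)`. Each column of the returned eigenvector matrix is then tested against the matching eigenvalue.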

(VI) kmeans/GMM

One final clustering step more or less completes the algorithm.

    // define the kmeans function.
    import scala.util.Random

    def kmeans(K: Int, eigenvector: Array[Array[Double]]): Array[(Int, Array[Double])] = {

        val len = eigenvector.length
        val col = eigenvector(0).length

        // initial the random centers: K distinct rows chosen at random (requires K <= len).
        var center: Map[Int, Array[Double]] = Map()
        var Karr: List[Int] = Nil
        while (Karr.length < K) {
            val RandomK = Random.nextInt(len)
            if (!Karr.contains(RandomK)) Karr = Karr ::: List(RandomK)
        }
        for (i <- 0 until K) {
            center += (i -> eigenvector(Karr(i)))
        }

        // classify the points into K clusters with the present center.
        def classify(ct: Map[Int, Array[Double]], input: Array[Array[Double]]): Array[(Int, Array[Double])] = {
            // calculate the (squared) euclidean distance to every center.
            val tempArr = input.map(a => {
                val euclidean = new Array[Double](K)
                for (i <- 0 until K) {
                    euclidean(i) = ct(i).zip(a).map(m => math.pow(m._1 - m._2, 2)).sum
                }
                euclidean.zipWithIndex
            })
            // tagging the points with the index of the nearest center.
            val tagging = tempArr.map(a => {
                val tag = a.sortWith((x, y) => x._1 < y._1).head
                tag._2
            })
            // output.
            val output = tagging.zip(input)
            output
        }

        // update the center.
        def updateCenter(oldCenter: Map[Int, Array[Double]], PWT: Array[(Int, Array[Double])]): Map[Int, Array[Double]] = {
            // groupby the result for computing.
            val clusters = PWT.groupBy(_._1)
            // update the newCenter
            var newCenter: Map[Int, Array[Double]] = Map()
            for (i <- 0 until K) {
                val clustersI = clusters.get(i) match {
                    case Some(s) => s.map(_._2)
                    // an empty cluster keeps its previous center
                    case None => Array(oldCenter(i))
                }
                val n = clustersI.length
                val centerI = for (j <- 0 until col) yield clustersI.map(_ (j)).sum / n
                newCenter += (i -> centerI.toArray)
            }
            newCenter
        }

        // initialize the blank variable.
        var pointsWithTag: Array[(Int, Array[Double])] = Array()
        var newCenter: Map[Int, Array[Double]] = Map()
        var movement: Seq[Double] = Seq()

        // loop.
        var loop: Boolean = true
        while (loop) {
            // tagging the points and update the center.
            pointsWithTag = classify(center, eigenvector)
            newCenter = updateCenter(center, pointsWithTag)
            // the movement of the center.
            movement = for (i <- 0 until K) yield newCenter(i).zip(center(i)).map(a => (a._1 - a._2).abs).sum
            // judge the movement is small enough or not.
            if (movement.exists(_ > math.pow(10, -5))) {
                center = newCenter
            } else {
                loop = false
            }
        }
        pointsWithTag
    }

Following the pseudo-code above step by step, the final output is shown below: a comparison of spectral clustering and kmeans on the twocircles dataset:

Figure 12: comparison of spectral clustering and kmeans
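The data behind the comparison is not included in the post. For readers who want something similar to experiment with, here is a hypothetical generator for a two-circles style dataset: two concentric rings of evenly spaced points. The object name `TwoCircles`, the function names and the choice of noise-free, evenly spaced points are all assumptions of this sketch, not the author's data.

```scala
object TwoCircles {
  // n evenly spaced points on a circle of the given radius around the origin.
  def circle(radius: Double, n: Int): Array[Array[Double]] =
    Array.tabulate(n) { i =>
      val t = 2 * math.Pi * i / n
      Array(radius * math.cos(t), radius * math.sin(t))
    }

  // Two concentric rings: the first n rows are the inner ring, the last n the outer ring.
  def twoCircles(inner: Double, outer: Double, n: Int): Array[Array[Double]] =
    circle(inner, n) ++ circle(outer, n)
}
```

A call such as `TwoCircles.twoCircles(1.0, 3.0, 50)` yields a 100 x 2 array in the same two-dimensional array format that the similarity matrix step takes as input. Neither ring is a convex cluster, which is the situation where kmeans on the raw coordinates tends to struggle.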

IV. References

Meila, Shi: A Random Walks View of Spectral Segmentation
Harry Yserentant: A short theory of the Rayleigh-Ritz method
Ulrike von Luxburg: A Tutorial on Spectral Clustering
Andrew Y. Ng, Michael I. Jordan, Yair Weiss: On Spectral Clustering: Analysis and an algorithm
L. Lovász: Random Walks on Graphs: A Survey
Fan R. K. Chung: Spectral Graph Theory
Xiao-Dong Zhang: The Laplacian eigenvalues of graphs: a survey
Bojan Mohar: The Laplacian Spectrum of Graphs
pluskid: http://blog.pluskid.org/?p=287
Wiki: https://en.wikipedia.org/wiki/Jacobi_eigenvalue_algorithm
G. E. Forsythe and P. Henrici: The Cyclic Jacobi Method for Computing the Principal Values of a Complex Matrix
Jacobi Transformations of a Symmetric Matrix

Repost: Spectral clustering and its implementation, explained in detail