Understanding singular value decomposition in the context of LSI

My question is about singular value decomposition (SVD) in general and latent semantic indexing (LSI) in particular. Say I have a matrix $A_{word \times document}$ containing the frequencies of 5 words across 7 documents.

    A = matrix(data=c(2,0,8,6,0,3,1,
                      1,6,0,1,7,0,1,
                      5,0,7,4,0,5,6,
                      7,0,8,5,0,8,5,
                      0,10,0,0,7,0,0), ncol=7, byrow=TRUE)
    rownames(A) <- c('doctor','car','nurse','hospital','wheel')

I obtain the matrix factorization of $A$ using SVD: $A = U \cdot D \cdot V^T$.

    s = svd(A)
    D = diag(s$d)      # singular value matrix
    S = diag(s$d^0.5)  # diag matrix with square roots of singular values.

In 1 and 2 it is stated that: $WordSim = U \cdot S$ …

9 r svd natural-language latent-semantic-indexing

Questions tagged «latent-semantic-indexing»