Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/EverestRs/article/details/84179385
Suppose we have the following data:
> A<-c('Y','Y','N','N','Y')
> B<-c('N','Y','Y','Y','N')
> C<-c('Y','Y','Y','Y','N')
> D<-c('Y','Y','y','y','Y')
> E<-c('N','n','n','Y','N')
> F<-c('Y','n','Y','Y','y')
> test<-rbind(A,B,C,D,E,F)
> test
  [,1] [,2] [,3] [,4] [,5]
A "Y"  "Y"  "N"  "N"  "Y"
B "N"  "Y"  "Y"  "Y"  "N"
C "Y"  "Y"  "Y"  "Y"  "N"
D "Y"  "Y"  "y"  "y"  "Y"
E "N"  "n"  "n"  "Y"  "N"
F "Y"  "n"  "Y"  "Y"  "y"
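Convert Y and y to 1, and N and n to 0. Because test is a character matrix, the assigned values are stored as the strings "1" and "0":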
> test[which(test=='Y')]=1
> test[which(test=='y')]=1
> test
  [,1] [,2] [,3] [,4] [,5]
A "1"  "1"  "N"  "N"  "1"
B "N"  "1"  "1"  "1"  "N"
C "1"  "1"  "1"  "1"  "N"
D "1"  "1"  "1"  "1"  "1"
E "N"  "n"  "n"  "1"  "N"
F "1"  "n"  "1"  "1"  "1"
> test[which(test=='N')]=0
> test[which(test=='n')]=0
> test
  [,1] [,2] [,3] [,4] [,5]
A "1"  "1"  "0"  "0"  "1"
B "0"  "1"  "1"  "1"  "0"
C "1"  "1"  "1"  "1"  "0"
D "1"  "1"  "1"  "1"  "1"
E "0"  "0"  "0"  "1"  "0"
F "1"  "0"  "1"  "1"  "1"
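The four assignments above can also be collapsed into a single case-insensitive comparison. The sketch below is one possible alternative, not the original author's method; the names is_yes and test_num are made up for illustration. It yields an integer matrix instead of a matrix of strings, which matches the numeric input that dist() is documented to accept. Note that it assumes every cell is some spelling of Y or N: any value other than Y/y becomes 0.

```r
# Rebuild the example matrix from the post.
A <- c('Y','Y','N','N','Y')
B <- c('N','Y','Y','Y','N')
C <- c('Y','Y','Y','Y','N')
D <- c('Y','Y','y','y','Y')
E <- c('N','n','n','Y','N')
F <- c('Y','n','Y','Y','y')
test <- rbind(A, B, C, D, E, F)

# TRUE where the cell is "Y" or "y" (case-insensitive comparison).
is_yes <- toupper(test) == "Y"

# Hypothetical name: an integer 0/1 matrix with the same shape and row names.
# as.integer() drops the dimensions, so matrix() restores them explicitly.
test_num <- matrix(as.integer(is_yes),
                   nrow = nrow(test),
                   dimnames = dimnames(test))
test_num
```

Keeping the original test untouched and writing the result to a new object makes it easy to check the conversion against the raw data.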
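Computing distances
In R, distances are computed with the dist() function.
Its signature is: dist(x, method = "euclidean", diag = FALSE, upper = FALSE, p = 2)
# x is the sample matrix or data frame, one row per sample; dist() expects numeric data.
method selects which distance is computed:
euclidean Euclidean distance: the square root of the sum of squared differences.
maximum Chebyshev distance: the largest absolute difference between two components.
manhattan Absolute-value (Manhattan) distance: the sum of absolute differences.
canberra Canberra distance, also called Lance (Lance-Williams) distance.
minkowski Minkowski distance; the power is set by p, which defaults to 2 (equivalent to Euclidean).
binary Distance for qualitative (binary) variables: non-zero entries count as "on" and zeros as "off", and the distance is the proportion of positions where only one of the two rows is on, among the positions where at least one is on.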
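To make the binary method concrete: the dist() documentation describes it as treating non-zero entries as "on" and zeros as "off", with the distance being the proportion of positions where exactly one vector is on among the positions where at least one is on. The sketch below applies that definition by hand to rows A and B of the example. binary_dist is a hypothetical helper written for illustration, not part of R, and it does not handle the case where both vectors are all zeros.

```r
# Hypothetical helper: binary distance between two 0/1 vectors,
# following the definition given in the dist() documentation.
binary_dist <- function(a, b) {
  on_a <- a != 0
  on_b <- b != 0
  only_one <- xor(on_a, on_b)   # exactly one of the two is "on"
  at_least_one <- on_a | on_b   # at least one of the two is "on"
  sum(only_one) / sum(at_least_one)
}

a <- c(1, 1, 0, 0, 1)  # row A after conversion
b <- c(0, 1, 1, 1, 0)  # row B after conversion

binary_dist(a, b)                      # 0.8: 4 mismatches out of 5 positions
dist(rbind(a, b), method = "binary")   # the built-in gives the same value
```

Positions where both vectors are 0 are ignored entirely, which is why this measure suits presence/absence data where a shared "N" carries little information.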
> distdata<-dist(test,method = "binary")
> distdata
          A         B         C         D         E
B 0.8000000
C 0.6000000 0.2500000
D 0.4000000 0.4000000 0.2000000
E 1.0000000 0.6666667 0.7500000 0.8000000
F 0.6000000 0.6000000 0.4000000 0.2000000 0.7500000
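A dist object prints only the lower triangle. If a single pair needs to be looked up by name, one option is to expand it with as.matrix(). The sketch below is an illustration under the assumption that the converted 0/1 data are entered directly as a numeric matrix; the names m and full are made up here.

```r
# The converted example data, entered directly as a numeric matrix.
m <- rbind(
  A = c(1, 1, 0, 0, 1),
  B = c(0, 1, 1, 1, 0),
  C = c(1, 1, 1, 1, 0),
  D = c(1, 1, 1, 1, 1),
  E = c(0, 0, 0, 1, 0),
  F = c(1, 0, 1, 1, 1)
)

distdata <- dist(m, method = "binary")

# Expand the lower-triangle dist object into a full symmetric matrix
# so that a pair can be indexed by row and column name.
full <- as.matrix(distdata)
full["A", "B"]   # 0.8
full["D", "F"]   # 0.2
```

The diagonal of full is 0, since every row is at distance 0 from itself.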