array, matrix, list and dataframe

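A summary of the reading and writing parts of the "introductory three Rs" (Reading, 'Riting, 'Rithmetic). Reading and writing work a little differently for each data structure.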

vector

Naming

month.days <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
names(month.days) <- month.name
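
Once a vector has names, they can be used as indices. The following is a small sketch that builds on the vector above; the lookups chosen here are only illustrative.

```r
# Build the named vector from the text above.
month.days <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
names(month.days) <- month.name

# Look up one element, or several, by name.
february <- month.days["February"]
summer <- month.days[c("June", "July")]

# Names of the months that have exactly 30 days.
thirty <- names(month.days)[month.days == 30]
print(thirty)
```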

Working with text

1. Splitting text

pangram <- "The quick brown fox jumps over the lazy dog"
strsplit(pangram, " ")

strsplit() splits pangram at each space. Its return value is a list.

words <- strsplit(pangram, " ")[[1]]

Taking the first element with [[1]] gives the character vector of words.
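
A short sketch of why the [[1]] is needed: strsplit() returns one list element per input string, so even a single string comes back wrapped in a list. The variable name pieces is introduced here only for illustration.

```r
pangram <- "The quick brown fox jumps over the lazy dog"

# strsplit() always returns a list, one element per input string.
pieces <- strsplit(pangram, " ")
print(class(pieces))    # "list"

# [[1]] unwraps the first (and only) element into a character vector.
words <- pieces[[1]]
print(class(words))     # "character"
print(length(words))    # 9

# With several input strings the list has several elements.
many <- strsplit(c("a b", "c d e"), " ")
print(length(many))     # 2
```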

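2. Joining text

paste(LETTERS[1:5], 1:5, sep = "_", collapse = "---")
paste("Sample", 1:5)

paste() takes one or more vectors as arguments. sep sets the separator placed between corresponding elements of the different vectors, while collapse sets how the elements of the resulting vector are merged into a single string. Joining the elements of words with spaces therefore needs collapse = " ".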
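
The difference between sep and collapse is easiest to see side by side. This is a minimal sketch; the variable names are made up for the example.

```r
# sep joins corresponding elements of different vectors (default is a space).
default_sep <- paste(LETTERS[1:3], 1:3)
custom_sep  <- paste(LETTERS[1:3], 1:3, sep = "_")
print(default_sep)   # "A 1" "B 2" "C 3"
print(custom_sep)    # "A_1" "B_2" "C_3"

# collapse then merges the resulting vector into one string.
joined <- paste(LETTERS[1:3], 1:3, sep = "_", collapse = "---")
print(joined)        # "A_1---B_2---C_3"

# Joining the elements of a single vector uses collapse, not sep.
words <- c("The", "quick", "brown", "fox")
sentence <- paste(words, collapse = " ")
print(sentence)      # "The quick brown fox"
```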

3. Sorting text

sort(letters, decreasing = TRUE)

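4. Searching text

substr(state.name, start = 3, stop = 6)   # extract characters 3 to 6 of each name
grep("New", state.name)                   # search by pattern

grep(pattern, x) returns the positions in x of the elements that match pattern.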
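
Because grep() returns positions, the matching strings themselves come from indexing with the result or from value = TRUE. A small sketch on the built-in state.name vector:

```r
# Positions of the state names that contain "New".
idx <- grep("New", state.name)
print(idx)                 # 29 30 31 32

# Use the positions as an index to get the names themselves ...
matched <- state.name[idx]
print(matched)

# ... or ask grep() for the values directly.
matched_direct <- grep("New", state.name, value = TRUE)

# grepl() returns a logical vector, one entry per element of x.
flags <- grepl("New", state.name)
print(sum(flags))          # 4
```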

5. Replacing text

gsub("cheap", "sheep's", "A wolf in cheap clothing")

x <- c("file_a.csv", "file_b.csv", "file_c.csv")
y <- gsub("file_", "", x)   # strip the common prefix from every element

Factors

factor(x, levels, labels) creates an R factor. levels lists the input values that occur in x, and labels gives the values the new factor displays in their place.
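
A minimal sketch of how levels and labels interact. The compass data is invented for the example.

```r
directions <- c("N", "S", "E", "S", "N")

# levels: the values expected in the input (here including an unused "W").
# labels: what each level is called in the resulting factor.
f <- factor(directions,
            levels = c("N", "E", "S", "W"),
            labels = c("North", "East", "South", "West"))

print(levels(f))         # "North" "East" "South" "West"
print(as.character(f))   # "North" "South" "East" "South" "North"

# Unused levels are kept and show up with a count of 0.
counts <- table(f)
print(counts)
```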

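Converting factors

numbers <- factor(c(9, 8, 10, 8, 9))
str(numbers)
as.character(numbers)               # returns the values as character strings
as.numeric(numbers)                 # returns the factor's internal integer codes
as.numeric(as.character(numbers))   # returns the original numeric values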
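
The three conversions above give different results, which is a common source of bugs. The sketch below stores each result so the difference is visible; the variable names are only for illustration.

```r
numbers <- factor(c(9, 8, 10, 8, 9))

# The levels are the sorted unique values, stored as strings.
print(levels(numbers))    # "8" "9" "10"

# as.numeric() on a factor returns positions within levels(), not the values.
codes <- as.numeric(numbers)
print(codes)              # 2 1 3 1 2

# Going through character recovers the original numbers.
values <- as.numeric(as.character(numbers))
print(values)             # 9 8 10 8 9

# An equivalent route: convert the levels once, then index by the factor.
values2 <- as.numeric(levels(numbers))[numbers]
```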

Ordered factors

Counting categorical data

table(state.region)

Ordered variables

Use factor() with the argument ordered=TRUE
Use the ordered() function
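
Both routes can be sketched as follows. The size data is invented for the example; what an ordered factor adds is that comparisons between levels become meaningful.

```r
sizes <- c("small", "large", "medium", "small")
size_levels <- c("small", "medium", "large")

# Route 1: factor() with ordered = TRUE.
f <- factor(sizes, levels = size_levels, ordered = TRUE)

# Route 2: ordered() with the same levels.
o <- ordered(sizes, levels = size_levels)

print(is.ordered(f))
print(all(f == o))

# Ordered factors support comparisons that follow the level order.
below_large <- f < "large"
print(below_large)             # TRUE FALSE TRUE TRUE
largest <- as.character(max(f))
print(largest)                 # "large"
```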

matrix

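matrix(data, nrow, ncol, byrow)   # general form; data is filled by column unless byrow=TRUE
dim()     # show the dimensions of a matrix
rbind()   # combine vectors as rows of a matrix
cbind()   # combine vectors as columns of a matrix
cbind(1:3, 4:6, matrix(7:12, ncol = 2))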
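
A small sketch of the fill order and of rbind() and cbind(). The object names are chosen only for this example.

```r
# By default the data fills the matrix column by column.
by_col <- matrix(1:6, nrow = 2)
print(by_col)

# byrow = TRUE fills it row by row instead.
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)
print(by_row)

# rbind() stacks vectors as rows, cbind() places them side by side as columns.
rows <- rbind(1:3, 4:6)
cols <- cbind(1:3, 4:6)
print(dim(rows))   # 2 3
print(dim(cols))   # 3 2

# cbind() also accepts matrices, as long as the row counts agree.
wide <- cbind(1:3, 4:6, matrix(7:12, ncol = 2))
print(dim(wide))   # 3 4
```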

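Indexing, modifying and naming

first.matrix <- matrix(1:12, ncol = 4, byrow = TRUE)
############# Extracting values
first.matrix[1:2, 2:3]
first.matrix[2:3, ]               # numeric indexing
first.matrix[-2, -3]              # everything except row 2 and column 3
first.matrix[-c(1, 3), ]          # only one row is left, so the result drops to a vector
first.matrix[2, , drop = FALSE]   # dimensions are kept; the result is still a matrix
############# Modifying values
first.matrix[3, 2] <- 4
first.matrix[2, ] <- c(1, 3)              # recycled along the row: 1 3 1 3
first.matrix[1:2, 3:4] <- c(8, 4, 2, 1)   # filled column by column
############# Naming rows and columns (x stands for any matrix with 2 rows and 2 columns)
rownames(x) <- c('a', 'b')
colnames(x) <- c('c', 'd')
colnames(x)[1] <- 'aa'
x['b', ]   # use a name as the index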
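
The effect of drop is worth checking with dim(). This is a sketch on the same 3 x 4 matrix; the row and column names r1 to r3 and a to d are made up for the example.

```r
first.matrix <- matrix(1:12, ncol = 4, byrow = TRUE)

# Selecting a single row drops the result to a plain vector ...
row2 <- first.matrix[2, ]
print(dim(row2))      # NULL

# ... unless drop = FALSE is given, which keeps a 1 x 4 matrix.
kept <- first.matrix[2, , drop = FALSE]
print(dim(kept))      # 1 4

# After naming, rows and columns can be addressed by name.
rownames(first.matrix) <- c("r1", "r2", "r3")
colnames(first.matrix) <- c("a", "b", "c", "d")
cell <- first.matrix["r2", "c"]
print(cell)           # 7
```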

Computation

t()          # transpose
solve()      # inverse
x %*% t(x)   # matrix multiplication
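
A short sketch of the three operations on a small invertible matrix. The values of x are chosen arbitrarily for the example.

```r
x <- matrix(c(2, 1, 1, 3), ncol = 2)

xt <- t(x)            # transpose
x_inv <- solve(x)     # inverse (x must be square and non-singular)

# %*% is matrix multiplication; x times its inverse is the identity
# matrix, up to floating-point rounding.
product <- x %*% x_inv
print(round(product, 10))

# Plain * multiplies element by element, which is a different operation.
elementwise <- x * x
print(elementwise)
```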

array

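Vectors and matrices are special cases of arrays: a matrix is a two-dimensional array, and a vector becomes an array once it is given a dim attribute.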

array(1:24, dim = c(3, 4, 2))   # create a three-dimensional array
dim(x) <- c(3, 4, 2)            # change the dimensions of vector x (x needs 24 elements)
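
Both ways of building a three-dimensional array can be compared directly. A minimal sketch, with a and x as illustrative names:

```r
# Create the array in one step.
a <- array(1:24, dim = c(3, 4, 2))
print(dim(a))          # 3 4 2

# Elements are addressed as [row, column, layer]; filling is column-major.
print(a[2, 3, 1])      # 8
print(a[2, 3, 2])      # 20

# Alternatively, give an existing vector a dim attribute.
x <- 1:24
print(is.array(x))     # FALSE: a plain vector has no dim attribute
dim(x) <- c(3, 4, 2)
print(is.array(x))     # TRUE

# Both objects now hold the same values in the same layout.
print(all(a == x))
```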

data.frame

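From a matrix: x.df <- as.data.frame(x)

From vectors: data <- data.frame(x, y, z)

If a variable is of character type, R versions before 4.0.0 convert it to a factor automatically; pass stringsAsFactors=FALSE to keep it as character. Since R 4.0.0 that is the default.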
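
Setting stringsAsFactors explicitly gives the same result on every R version. A small sketch with invented column names:

```r
# Keep strings as character vectors.
df_chr <- data.frame(name = c("a", "b"), score = c(1, 2),
                     stringsAsFactors = FALSE)

# Convert strings to factors (the old default behaviour).
df_fac <- data.frame(name = c("a", "b"), score = c(1, 2),
                     stringsAsFactors = TRUE)

print(class(df_chr$name))   # "character"
print(class(df_fac$name))   # "factor"
```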

names(data)[2] <- 'B'                  # rename a column
rownames(data) <- c('a', 'b', 'c')     # name the observations

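Working with values in a data.frame

A data.frame is not a vector but a list of vectors of equal length. For data manipulation it can still be treated like a matrix. Use $ to access a single variable and [] to access several.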
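
The list-like and matrix-like sides of a data.frame can be sketched as follows. The data frame df and its columns are made up for the example.

```r
df <- data.frame(A = 1:3,
                 B = c(2.5, 3.5, 4.5),
                 C = c("x", "y", "z"),
                 stringsAsFactors = FALSE)

# List side: a data.frame is a list, and $ or [[ ]] returns one column.
print(is.list(df))           # TRUE
col_a <- df$A
col_b <- df[["B"]]

# Matrix side: [rows, columns] indexing.
sub <- df[, c("A", "B")]     # several variables, still a data.frame
cell <- df[2, "B"]           # a single value
print(cell)                  # 3.5

# Rows can be selected with a logical condition.
big <- df[df$A > 1, ]
print(nrow(big))             # 2
```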

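######### Modifying values (x is an existing data frame with columns A and B)
y <- rbind(x, new.obs)         # add a single observation (new.obs is a vector here)
y <- rbind(x, 'd' = new.obs)   # set the row name explicitly

new.obs <- data.frame(A = c(1, 2), B = c(2, 3))
rownames(new.obs) <- c('e', 'f')
y <- rbind(x, new.obs)         # add several observations

x[c('e', 'f'), ] <- matrix(c(1, 1, 2, 4), ncol = 2)   # add several values by index

########## Modifying variables
x$C <- new.var                 # add one variable
new.df <- data.frame(newvar1, newvar2)
x <- cbind(x, new.df)          # add several variables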
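
The snippets above rely on objects that are not defined in these notes. The following is a self-contained sketch of the same steps; all values, and the column names C, D and E, are invented for the example.

```r
# A small starting data frame with named rows.
x <- data.frame(A = c(1, 2), B = c(3, 4), row.names = c("a", "b"))

# Add one observation; the argument name becomes the row name.
y <- rbind(x, d = c(5, 6))

# Add several observations from another data frame with the same columns.
new.obs <- data.frame(A = c(7, 8), B = c(9, 10), row.names = c("e", "f"))
y <- rbind(y, new.obs)
print(rownames(y))

# Add one variable with $, then several at once with cbind().
y$C <- y$A + y$B
new.df <- data.frame(D = letters[1:5],
                     E = c(TRUE, FALSE, TRUE, FALSE, TRUE))
y <- cbind(y, new.df)
print(dim(y))   # 5 5
```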

list

####### Creating a list
new.list <- list(x, y)                     # unnamed list
new.nlist <- list(name1 = x, name2 = y)    # named list
names(new.nlist)                           # get the element names
length(new.list)                           # get the number of elements

Extracting elements from a list

Use [[]] to return the element itself
Use [] to return a list containing the selected elements
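
The difference shows up in the class of the result. A minimal sketch with an invented two-element list:

```r
new.nlist <- list(name1 = 1:3, name2 = c("a", "b"))

# [[ ]] and $ return the element itself.
elem <- new.nlist[["name1"]]
print(class(elem))        # "integer"
same <- new.nlist$name1

# [ ] returns a list, even when only one element is selected.
sub <- new.nlist["name1"]
print(class(sub))         # "list"
print(length(sub))        # 1

# [ ] can select several elements at once; [[ ]] cannot.
both <- new.nlist[c("name1", "name2")]
print(length(both))       # 2
```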

######### Changing element values
new.nlist[[1]] <- x
new.nlist[['name1']] <- x
new.nlist$name1 <- x
new.nlist[1] <- list(x)           # with [ ] the replacement must itself be a list
new.nlist[1:2] <- list(x, y)

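########## Removing elements
new.nlist[[1]] <- NULL
new.nlist[['name1']] <- NULL
new.nlist$name1 <- NULL
new.nlist[1] <- list(NULL)   # keeps the element and sets its value to NULL; use new.nlist[1] <- NULL to remove it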
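
Assigning NULL and assigning list(NULL) behave differently, which the length of the list makes visible. A small sketch on a throwaway list named l:

```r
l <- list(name1 = 1, name2 = 2, name3 = 3)

# Assigning NULL with [[ ]] (or $) removes the element.
l[["name1"]] <- NULL
print(length(l))          # 2

# Assigning list(NULL) with [ ] keeps the slot and stores NULL in it.
l["name2"] <- list(NULL)
print(length(l))          # still 2
print(names(l))           # "name2" "name3"
print(is.null(l$name2))   # TRUE

# Assigning plain NULL with [ ] removes the element.
l["name2"] <- NULL
print(names(l))           # "name3"
```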

########## Adding elements
new.nlist$name3 <- z
new.nlist[['name3']] <- z
new.nlist['name3'] <- list(z)

########## Combining lists
z <- list(z)        # wrap z in a list so it is added as one element
c(new.nlist, z)
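
Wrapping z in list() before combining matters when z is a vector with more than one element. A minimal sketch with invented contents:

```r
new.nlist <- list(name1 = 1:3, name2 = "a")
z <- c(10, 20)

# Wrapped in list(): z is appended as a single element.
combined <- c(new.nlist, list(z))
print(length(combined))    # 3

# Not wrapped: every element of z becomes its own list element.
flattened <- c(new.nlist, z)
print(length(flattened))   # 4

# A name can be given to the new element while combining.
named <- c(new.nlist, list(name3 = z))
print(names(named))        # "name1" "name2" "name3"
```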