Merging DataFrame objects in Python

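Every time I analyze data I run into the same small problems and can never recall the fix on the spot. They say three times is the limit, and this has happened more often than that, so I am writing it down here. It may not be complete, and it is mainly for my own future reference.
Declare the following three DataFrames: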

 import numpy as np
 import pandas as pd
 
 b = np.random.random((3,2))
 A = pd.DataFrame(b,columns=['A1','A2'])
 
 c = np.random.random((3,2))
 C = pd.DataFrame(c,columns=['C1','C2'])
 
 B = pd.DataFrame(np.random.random((3,2)),columns=['C1','C2'])

Their contents are:

 A
Out[9]:
         A1        A2
0  0.193384  0.088973
1  0.013379  0.381474
2  0.975780  0.431396

 B
Out[18]:
         C1        C2
0  0.181912  0.312157
1  0.760391  0.082399
2  0.313043  0.625784

C
Out[12]:
         C1        C2
0  0.465802  0.646758
1  0.383527  0.113343
2  0.282318  0.870743

Note that all three DataFrames have three rows and two columns, and that B and C share the same column labels.

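1. Selecting rows from a DataFrame

i) Selecting a single row

There are three ways:
1. B.iloc[[i]] selects the row at position i (counting from 0) and returns it as a one-row DataFrame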

In [20]: B.iloc[[1]]
Out[20]:
         C1        C2
1  0.760391  0.082399

2. B.iloc[i] also selects the row at position i, but returns it as a Series rather than a one-row DataFrame

In [21]: B.iloc[1]
Out[21]:
C1    0.760391
C2    0.082399
Name: 1, dtype: float64

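3. B.ix[i] selects row i. As the warning in the output says, .ix is deprecated, and it was removed in pandas 1.0; use B.loc[i] (by label) or B.iloc[i] (by position) instead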

In [22]: B.ix[1]
D:\Anaconda3\Scripts\ipython:1: DeprecationWarning:
.ix is deprecated. Please use
.loc for label based indexing or
.iloc for positional indexing

See the documentation here:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#ix-indexer-is-deprecated
Out[22]:
C1    0.760391
C2    0.082399
Name: 1, dtype: float64
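
The three single-row lookups above can be compared side by side. The following is a minimal sketch, not the original session: B is regenerated with random values, and B2 is a hypothetical relabelled copy introduced only to show where label-based and position-based lookup differ.

```python
import numpy as np
import pandas as pd

# Illustrative data with the same shape and column labels as B above
B = pd.DataFrame(np.random.random((3, 2)), columns=['C1', 'C2'])

row_as_frame = B.iloc[[1]]   # list of positions -> one-row DataFrame
row_as_series = B.iloc[1]    # scalar position -> Series indexed by column label
row_by_label = B.loc[1]      # label lookup; matches iloc here because the index is 0, 1, 2

# Hypothetical relabelled copy, to show where label and position diverge
B2 = B.copy()
B2.index = [10, 11, 12]
second_by_position = B2.iloc[1]   # still the second row
second_by_label = B2.loc[11]      # the same row, addressed by its label

print(type(row_as_frame).__name__, row_as_frame.shape)
print(type(row_as_series).__name__, row_as_series.shape)
```

With the default 0, 1, 2 index, .loc and .iloc happen to agree, which is probably why .ix looked harmless; once the index labels are not positions, the two have to be chosen deliberately.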

ii) Selecting multiple rows

1. Selecting several consecutive rows
I. Use list-style slicing

B[:-1]
Out[23]:
         C1        C2
0  0.181912  0.312157
1  0.760391  0.082399

II. Use iloc[] slicing, which takes a row slice and a column slice

 B.iloc[-1:,:]
Out[25]:
         C1        C2
2  0.313043  0.625784
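
The two slicing forms above can be checked against each other. This is a small sketch on regenerated random data; the last line, picking non-consecutive rows with a list of positions, is an addition that the notes above do not cover.

```python
import numpy as np
import pandas as pd

# Illustrative data with the same shape and column labels as B above
B = pd.DataFrame(np.random.random((3, 2)), columns=['C1', 'C2'])

all_but_last = B[:-1]          # list-style slice on the frame, by position
same_rows = B.iloc[0:2]        # positions 0 and 1; the end position is excluded
last_row = B.iloc[-1:, :]      # last row, all columns, still a DataFrame
picked = B.iloc[[0, 2]]        # non-consecutive rows via a list of positions

print(all_but_last.shape, last_row.shape, picked.shape)
```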

2. Selecting columns from a DataFrame

Selecting a single column

1. Select by column name

 B['C1']
Out[26]:
0    0.181912
1    0.760391
2    0.313043
Name: C1, dtype: float64

2. Select with iloc

 B.iloc[:,:-1]
Out[24]:
         C1
0  0.181912
1  0.760391
2  0.313043

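In the form above, the first : has nothing on either side, which means all rows are selected; the :-1 after the comma selects every column except the last, which here leaves only C1.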
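
Since B.iloc[:,:-1] really means "all columns but the last", it only yields a single column when the frame has two. A sketch of the more direct positional forms, on regenerated random data:

```python
import numpy as np
import pandas as pd

# Illustrative data with the same shape and column labels as B above
B = pd.DataFrame(np.random.random((3, 2)), columns=['C1', 'C2'])

by_name = B['C1']             # Series
by_position = B.iloc[:, 0]    # Series, first column by position
as_frame = B.iloc[:, [0]]     # one-column DataFrame
all_but_last = B.iloc[:, :-1] # DataFrame; a single column only because B has two

print(type(by_position).__name__, type(as_frame).__name__)
```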

That took a while to write, and it feels like a bit of a waste of time.

3. Merging DataFrames

Of the three DataFrames above, B and C share the same column labels.

i) Same column labels

 B.append(C)
Out[27]:
         C1        C2
0  0.181912  0.312157
1  0.760391  0.082399
2  0.313043  0.625784
0  0.465802  0.646758
1  0.383527  0.113343
2  0.282318  0.870743

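Note that the row index values are now duplicated; use B.append(C).reset_index(drop=True) to rebuild the index. DataFrame.append was deprecated in pandas 1.4 and removed in 2.0; on current versions pd.concat([B, C], ignore_index=True) gives the same result.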
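
For pandas versions where append is no longer available, the same row-wise stacking can be sketched with pd.concat. The data here is regenerated at random, so the values differ from the output above.

```python
import numpy as np
import pandas as pd

# Illustrative data with the same shape and column labels as B and C above
B = pd.DataFrame(np.random.random((3, 2)), columns=['C1', 'C2'])
C = pd.DataFrame(np.random.random((3, 2)), columns=['C1', 'C2'])

stacked = pd.concat([B, C])                        # keeps the original labels 0, 1, 2, 0, 1, 2
renumbered = pd.concat([B, C], ignore_index=True)  # fresh labels 0..5

print(list(stacked.index))
print(list(renumbered.index))
```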

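ii) Same row index

When the row index (the index values) is the same, as with A, B and C here, how should the frames be merged? Note that B and C share both their row index and their column labels, so B.join(C) cannot be used directly: pandas raises an error about overlapping column names unless lsuffix/rsuffix is given. Joining with A works because its column labels differ.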

 B.join(A)
Out[33]:
         C1        C2        A1        A2
0  0.181912  0.312157  0.193384  0.088973
1  0.760391  0.082399  0.013379  0.381474
2  0.313043  0.625784  0.975780  0.431396

In [34]: C.join(A)
Out[34]:
         C1        C2        A1        A2
0  0.465802  0.646758  0.193384  0.088973
1  0.383527  0.113343  0.013379  0.381474
2  0.282318  0.870743  0.975780  0.431396
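
A sketch of the column-wise cases on regenerated random data: the failing B.join(C), the same join with suffixes, and pd.concat along axis=1 as an alternative to join. The suffix strings '_B' and '_C' are arbitrary choices made for this example.

```python
import numpy as np
import pandas as pd

# Illustrative data with the same shapes and column labels as A, B and C above
A = pd.DataFrame(np.random.random((3, 2)), columns=['A1', 'A2'])
B = pd.DataFrame(np.random.random((3, 2)), columns=['C1', 'C2'])
C = pd.DataFrame(np.random.random((3, 2)), columns=['C1', 'C2'])

# Joining frames with overlapping column names fails without suffixes
join_failed = False
try:
    B.join(C)
except ValueError as err:
    join_failed = True
    print(err)

# Suffixes disambiguate the overlapping names
BC = B.join(C, lsuffix='_B', rsuffix='_C')

# Side-by-side concatenation aligned on the row index
BA = pd.concat([B, A], axis=1)

print(list(BC.columns))
print(list(BA.columns))
```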

These two approaches are enough for what I am dealing with right now.
