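Pandas: Concatenating, Sorting, and Resetting Data

I. Concatenating data

1.1 Row-wise concatenation (vertical, axis 0): pd.concat([df1, df2])
1.2 Column-wise concatenation (horizontal, axis 1): pd.concat([df1, df2], axis=1)

II. Sorting

2.1 Ascending sort (the default): df.sort_values(by)
2.2 Descending sort: df.sort_values(by, ascending=False)
2.3 Sorting by index: df.sort_index()
2.4 Resetting the index (back to the default 0 to n-1): df.reset_index()

III. Renaming column labels: df.rename()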

I. Concatenating data

Prepare the data:

import pandas as pd

# 'name' holds only four values, so row 4 of df_1 is filled with NaN.
d_1 = {'name': pd.Series(['a', 'b', 'c', 'd'], index=[0, 1, 2, 3]),
       'attri_1': pd.Series([1.1, 2.2, 3.3, 4.4, 5.5], index=[0, 1, 2, 3, 4]),
       'attri_2': pd.Series([1, 2, 3, 4, 5], index=[0, 1, 2, 3, 4])}

d_2 = {'name': pd.Series(['aa', 'bb', 'cc', 'dd', 'ee'], index=[0, 1, 2, 3, 4]),
       'attri_1': pd.Series([2.1, 3.2, 4.3, 5.4, 6.5], index=[0, 1, 2, 3, 4]),
       'attri_2': pd.Series([1, 2, 3, 4, 5], index=[0, 1, 2, 3, 4])}

df_1 = pd.DataFrame(d_1)

df_2 = pd.DataFrame(d_2)

print(df_1)
print(df_2)

Output:

  name  attri_1  attri_2
0    a      1.1        1
1    b      2.2        2
2    c      3.3        3
3    d      4.4        4
4  NaN      5.5        5
  name  attri_1  attri_2
0   aa      2.1        1
1   bb      3.2        2
2   cc      4.3        3
3   dd      5.4        4
4   ee      6.5        5

1.1 Row-wise concatenation (vertical, axis 0): pd.concat([df1, df2])

df_3 = pd.concat([df_1,df_2])
print(df_3)

Output:

  name  attri_1  attri_2
0    a      1.1        1
1    b      2.2        2
2    c      3.3        3
3    d      4.4        4
4  NaN      5.5        5
0   aa      2.1        1
1   bb      3.2        2
2   cc      4.3        3
3   dd      5.4        4
4   ee      6.5        5

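Notes:

The two DataFrames should have the same column labels. If they differ, pd.concat keeps the union of the columns and fills the missing cells with NaN.
The result keeps each row's original index label, so the labels 0 to 4 appear twice here.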
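
The two notes above can be checked with a small sketch. The frames `left` and `right` and their contents are made up for illustration and are not part of the original data; `ignore_index` is a standard `pd.concat` parameter.

```python
import pandas as pd

# Hypothetical frames whose columns only partly overlap.
left = pd.DataFrame({'name': ['a', 'b'], 'attri_1': [1.1, 2.2]})
right = pd.DataFrame({'name': ['aa', 'bb'], 'attri_2': [1, 2]})

# Columns are aligned by label; a column missing from one frame is filled with NaN.
stacked = pd.concat([left, right])
print(stacked)

# ignore_index=True discards the original row labels and renumbers from 0.
renumbered = pd.concat([left, right], ignore_index=True)
print(renumbered)
```

With `ignore_index=True` the repeated labels disappear, which is usually what you want when the original row labels carry no meaning.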

1.2 Column-wise concatenation (horizontal, axis 1): pd.concat([df1, df2], axis=1)

df_4 = pd.concat([df_1,df_2], axis=1)
print(df_4)

Output:

  name  attri_1  attri_2 name  attri_1  attri_2
0    a      1.1        1   aa      2.1        1
1    b      2.2        2   bb      3.2        2
2    c      3.3        3   cc      4.3        3
3    d      4.4        4   dd      5.4        4
4  NaN      5.5        5   ee      6.5        5
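
In the example above both frames share the index 0 to 4, so the rows line up one to one. As a minimal sketch of what happens when the indexes differ, the hypothetical frames `a` and `b` below overlap only partly; `join='inner'` is a standard `pd.concat` option.

```python
import pandas as pd

# Hypothetical frames with partly overlapping row labels.
a = pd.DataFrame({'x': [1, 2, 3]}, index=[0, 1, 2])
b = pd.DataFrame({'y': [10, 20, 30]}, index=[1, 2, 3])

# Rows are matched by index label, not by position;
# labels present in only one frame get NaN in the other frame's columns.
wide = pd.concat([a, b], axis=1)
print(wide)

# join='inner' keeps only the labels present in both frames.
common = pd.concat([a, b], axis=1, join='inner')
print(common)
```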

II. Sorting

Prepare the data:

d = {'name': pd.Series(['a', 'b', 'c', 'd'], index=[0, 1, 2, 3]),
     'attri_1': pd.Series([5.1, 4.2, 2.3, 1.4, 3.5], index=[0, 1, 2, 3, 4]),
     'attri_2': pd.Series([1, 2, 3, 4, 5], index=[0, 1, 2, 3, 4])}

df = pd.DataFrame(d)

print(df)

Output:

  name  attri_1  attri_2
0    a      5.1        1
1    b      4.2        2
2    c      2.3        3
3    d      1.4        4
4  NaN      3.5        5

2.1 Ascending sort (the default): df.sort_values(by)

print(df.sort_values('attri_1'))  # each row keeps its index label, so the index order changes too

Output:

  name  attri_1  attri_2
3    d      1.4        4
2    c      2.3        3
4  NaN      3.5        5
1    b      4.2        2
0    a      5.1        1

2.2 Descending sort: df.sort_values(by, ascending=False)

print(df.sort_values('attri_1', ascending=False))
print(df)  # sort_values returns a new DataFrame and leaves df unchanged, so this prints the original order

Output:

  name  attri_1  attri_2
0    a      5.1        1
1    b      4.2        2
4  NaN      3.5        5
2    c      2.3        3
3    d      1.4        4
  name  attri_1  attri_2
0    a      5.1        1
1    b      4.2        2
2    c      2.3        3
3    d      1.4        4
4  NaN      3.5        5
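
Because sort_values does not modify the original, the sorted result has to be stored somewhere to be kept. The sketch below shows two common ways to do that on a trimmed copy of the data above; `df_sorted` is a name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'attri_1': [5.1, 4.2, 2.3, 1.4, 3.5],
                   'attri_2': [1, 2, 3, 4, 5]})

# Option 1: bind the sorted copy to a name; df itself is untouched.
df_sorted = df.sort_values('attri_1', ascending=False)

# Option 2: sort the existing object in place (the call itself returns None).
df.sort_values('attri_1', inplace=True)

print(df_sorted)
print(df)
```

Assigning the result to a name tends to be easier to follow in chained code, since each step returns a new object.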

2.3 Sorting by index: df.sort_index()

df_5 = df.sort_values('attri_1')

print(df_5)
print(df_5.sort_index())

Output:

  name  attri_1  attri_2
3    d      1.4        4
2    c      2.3        3
4  NaN      3.5        5
1    b      4.2        2
0    a      5.1        1
  name  attri_1  attri_2
0    a      5.1        1
1    b      4.2        2
2    c      2.3        3
3    d      1.4        4
4  NaN      3.5        5
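
sort_index accepts the same `ascending` flag as sort_values, and with `axis=1` it sorts the column labels instead of the row labels. A minimal sketch on a hypothetical frame:

```python
import pandas as pd

# Hypothetical frame with unordered row and column labels.
df = pd.DataFrame({'b': [1, 2, 3], 'a': [4, 5, 6]}, index=[2, 0, 1])

# Row labels in descending order: 2, 1, 0.
by_row_label_desc = df.sort_index(ascending=False)
print(by_row_label_desc)

# Column labels in ascending order: 'a', 'b'.
by_col_label = df.sort_index(axis=1)
print(by_col_label)
```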

2.4 Resetting the index (back to the default 0 to n-1): df.reset_index()

d = {'attri_1': pd.Series([5.1, 4.2, 2.3, 1.4, 3.5], index=['a', 'b', 'c', 'd', 'e'])}
df = pd.DataFrame(d)

print(df)
print(df.reset_index())  # installs a default 0 to n-1 index and moves the old index into a column labeled 'index'

Output:

   attri_1
a      5.1
b      4.2
c      2.3
d      1.4
e      3.5
  index  attri_1
0     a      5.1
1     b      4.2
2     c      2.3
3     d      1.4
4     e      3.5
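
When the old index is not worth keeping, for example after a sort, `drop=True` discards it instead of turning it into a column. The sketch below reuses the data above; `ranked` is a name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'attri_1': [5.1, 4.2, 2.3, 1.4, 3.5]},
                  index=['a', 'b', 'c', 'd', 'e'])

# Sort by value, then replace the shuffled labels with a fresh 0 to n-1 index.
# drop=True throws the old labels away rather than adding an 'index' column.
ranked = df.sort_values('attri_1').reset_index(drop=True)
print(ranked)
```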

III. Renaming column labels: df.rename()

d = {'attri_1': pd.Series([5.1, 4.2, 2.3, 1.4, 3.5], index=['a', 'b', 'c', 'd', 'e'])}
df = pd.DataFrame(d)
print(df)
print(df.rename(columns={'attri_1': 'new'}))

Output:

   attri_1
a      5.1
b      4.2
c      2.3
d      1.4
e      3.5
   new
a  5.1
b  4.2
c  2.3
d  1.4
e  3.5
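
Like sort_values, rename returns a new DataFrame and leaves the original unchanged. As a short sketch on a hypothetical two-column frame, it can also rename row labels through `index=`, and it accepts a function in place of a dictionary. By default, keys in the mapping that match no label are ignored.

```python
import pandas as pd

# Hypothetical frame used only for this illustration.
df = pd.DataFrame({'attri_1': [5.1, 4.2], 'attri_2': [1, 2]}, index=['a', 'b'])

# Rename one column and one row label; 'missing' matches nothing and is ignored.
renamed = df.rename(columns={'attri_1': 'new', 'missing': 'ignored'},
                    index={'a': 'row_a'})
print(renamed)

# A function is applied to every column label.
upper = df.rename(columns=str.upper)
print(upper)

# The original frame still has its old labels.
print(df)
```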
