Copyright notice: this article belongs to 诸葛老刘. https://blog.csdn.net/weixin_39791387/article/details/84863660
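Recently at work I needed to compute sum() statistics over the columns of a pandas DataFrame, which means appending a new row that holds the totals.
One way to do it: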
import pandas as pd
import numpy as np
df = pd.DataFrame([
{'date': '2018-12-01', 'total': 100, 'total2': 100.23},
{'date': '2018-12-02', 'total': 102, 'total2': 2312.13},
{'date': '2018-12-03', 'total': 112, 'total2': 123.32},
{'date': '2018-12-04', 'total': 134, 'total2': 3453.23}
])
# Goal: apply np.sum() to the 'total' and 'total2' columns
df2 = df.set_index('date')  # make date the index so it is excluded from the sum
df2.loc['Sum'] = df2.apply(lambda x: np.sum(x))  # key step: append the column sums as a new row labeled 'Sum'
print(df2)
# output
            total   total2
date
2018-12-01  100.0   100.23
2018-12-02  102.0  2312.13
2018-12-03  112.0   123.32
2018-12-04  134.0  3453.23
Sum         448.0  5988.91
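The same total row can likely be built without apply, because DataFrame.sum() already returns one sum per column. The following is only a minimal sketch on the same sample data; the names `data` and `totals` are introduced here for illustration and are not part of the original code.

```python
import pandas as pd

# Same sample data as above, with date moved into the index
data = pd.DataFrame([
    {'date': '2018-12-01', 'total': 100, 'total2': 100.23},
    {'date': '2018-12-02', 'total': 102, 'total2': 2312.13},
    {'date': '2018-12-03', 'total': 112, 'total2': 123.32},
    {'date': '2018-12-04', 'total': 134, 'total2': 3453.23},
]).set_index('date')

# DataFrame.sum() defaults to axis=0, so it yields one sum per column
totals = data.sum()

# Assigning to a label that does not exist yet appends a new row
data.loc['Sum'] = totals
print(data)
```

Compute the totals before appending the row. If 'Sum' is already in the frame, calling sum() again would count it twice.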
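# Extension: sum across each row instead
df3 = df.copy()  # work on a copy so the new column is not added to df itself
df3['col_sum'] = df3.apply(lambda x: np.sum(x.iloc[1:]), axis=1)  # skip the first column (date) by position
# Equivalent to naming the columns explicitly:
df3['col_sum'] = df3.apply(lambda x: np.sum([x['total'], x['total2']]), axis=1)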
print(df3)
# output
         date  total   total2  col_sum
0  2018-12-01    100   100.23   200.23
1  2018-12-02    102  2312.13  2414.13
2  2018-12-03    112   123.32   235.32
3  2018-12-04    134  3453.23  3587.23
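The row-wise sum can probably also be written without apply by selecting the numeric columns and calling sum(axis=1). Again this is a rough sketch on the same sample data rather than the method used above; the name `rows` is made up for this example.

```python
import pandas as pd

# Same sample data as above; date stays a regular column here
rows = pd.DataFrame([
    {'date': '2018-12-01', 'total': 100, 'total2': 100.23},
    {'date': '2018-12-02', 'total': 102, 'total2': 2312.13},
    {'date': '2018-12-03', 'total': 112, 'total2': 123.32},
    {'date': '2018-12-04', 'total': 134, 'total2': 3453.23},
])

# Select only the columns to add up, then sum across them (axis=1) per row
rows['col_sum'] = rows[['total', 'total2']].sum(axis=1)
print(rows)
```

Naming the columns explicitly keeps the date strings out of the sum, and it does not depend on column order the way a positional slice does.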