pandas.DataFrame.apply() in practice: adding a summary row or a summary column

Copyright notice: all rights reserved by 诸葛老刘 https://blog.csdn.net/weixin_39791387/article/details/84863660

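Recently at work I needed to run sum() over the columns of a pandas DataFrame, which means appending a new row that holds the totals.
Here is how to do it: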

import pandas as pd
import numpy as np
df = pd.DataFrame([
    {'date': '2018-12-01', 'total': 100, 'total2': 100.23},
    {'date': '2018-12-02', 'total': 102, 'total2': 2312.13},
    {'date': '2018-12-03', 'total': 112, 'total2': 123.32},
    {'date': '2018-12-04', 'total': 134, 'total2': 3453.23}
    ])

# Goal: apply np.sum() to the 'total' and 'total2' columns
df2 = df.set_index('date')  # make date the index so it is left out of the sum
df2.loc['Sum'] = df2.apply(lambda x: np.sum(x))  # key step: append the column sums as a new row
print(df2)
# output
            total   total2
date                      
2018-12-01  100.0   100.23
2018-12-02  102.0  2312.13
2018-12-03  112.0   123.32
2018-12-04  134.0  3453.23
Sum         448.0  5988.91
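
The same total row can probably be built without apply(), since DataFrame.sum() already sums column by column. The following is only a minimal sketch on the sample data from this post; the names by_date, totals and with_sum are made up for illustration and are not part of the original code.

```python
import pandas as pd

df = pd.DataFrame([
    {'date': '2018-12-01', 'total': 100, 'total2': 100.23},
    {'date': '2018-12-02', 'total': 102, 'total2': 2312.13},
    {'date': '2018-12-03', 'total': 112, 'total2': 123.32},
    {'date': '2018-12-04', 'total': 134, 'total2': 3453.23},
])

# Move date into the index so only the numeric columns are summed.
by_date = df.set_index('date')

# DataFrame.sum() defaults to axis=0, i.e. one sum per column.
totals = by_date.sum()

# Turn the sums into a one-row frame labelled 'Sum' and append it.
# (hypothetical names; by_date itself is left unchanged)
with_sum = pd.concat([by_date, totals.to_frame('Sum').T])
print(with_sum)
```

Compared with assigning to .loc['Sum'], building a new frame with pd.concat keeps the original data untouched, which can be handy when the unsummed frame is still needed later.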

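# Extension: sum across each row instead
df3 = df.copy()  # work on a copy so that adding a column does not modify df
df3['col_sum'] = df3.apply(lambda x: np.sum(x.iloc[1:]), axis=1)  # skip the first column (date)
# Equivalent to the following, which names the columns explicitly
df3['col_sum'] = df3.apply(lambda x: np.sum([x['total'], x['total2']]), axis=1)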

print(df3)
# output
         date  total   total2  col_sum
0  2018-12-01    100   100.23   200.23
1  2018-12-02    102  2312.13  2414.13
2  2018-12-03    112   123.32   235.32
3  2018-12-04    134  3453.23  3587.23
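
For the row totals, a vectorized alternative to apply(..., axis=1) is to select the numeric columns and call sum(axis=1). This is just a sketch on the same sample data; the variable names numeric and row_totals are illustrative assumptions, not from the original post.

```python
import pandas as pd

df = pd.DataFrame([
    {'date': '2018-12-01', 'total': 100, 'total2': 100.23},
    {'date': '2018-12-02', 'total': 102, 'total2': 2312.13},
    {'date': '2018-12-03', 'total': 112, 'total2': 123.32},
    {'date': '2018-12-04', 'total': 134, 'total2': 3453.23},
])

# Keep only numeric columns so the date strings are not included in the sum.
numeric = df.select_dtypes(include='number')

# axis=1 sums across the columns of each row.
row_totals = df.copy()
row_totals['col_sum'] = numeric.sum(axis=1)
print(row_totals)
```

Selecting by dtype avoids depending on the position of the date column, so the result should stay the same if the columns are reordered.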
