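Python Written Tests, Interviews, and Project Practice 2020, Exercise 9 of 100: Generating New Rows or Columns in pandas from the Values of Some Rows or Columns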

How to generate new rows or columns in pandas from the values of some rows or columns

Given the DataFrame below, add a column c whose values are the sum of columns a and b.

>>> import pandas as pd
>>>
>>> df = pd.DataFrame({'a':[1,2], 'b':[3,4]})
>>> df
   a  b
0  1  3
1  2  4


Reference answer

>>> import pandas as pd
>>> df = pd.DataFrame({'a':[1,2], 'b':[3,4]})
>>> df['c'] = df.apply(lambda row: row.a + row.b, axis=1)
>>> df
   a  b  c
0  1  3  4
1  2  4  6
>>> def add(x):
...     return x.a + x.b
...
>>> df['d'] = df.apply(add,  axis=1)
>>> df
   a  b  c  d
0  1  3  4  4
1  2  4  6  6
>>> # sum runs over every column that exists so far, so e = a + b + c + d
>>> df['e'] = df.apply(sum, axis=1)
>>> df
   a  b  c  d   e
0  1  3  4  4  12
1  2  4  6  6  18
>>> df['f'] = df.apply(add,  axis=1)
>>> df
   a  b  c  d   e  f
0  1  3  4  4  12  4
1  2  4  6  6  18  6
>>> # axis=0 sums each column; the result is appended as a new row
>>> df.loc[len(df)] = df.apply(sum, axis=0)
>>> df
   a  b   c   d   e   f
0  1  3   4   4  12   4
1  2  4   6   6  18   6
2  3  7  10  10  30  10
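
For plain arithmetic like this, apply is not strictly required. As a minimal sketch (the column name d and the choice to sum only a and b are illustrative and not part of the reference answer), the same kind of column and totals row can also be built with vectorized operations:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# Adding two columns works element by element, so no apply is needed.
df['c'] = df['a'] + df['b']

# Sum an explicit subset of columns along each row.
df['d'] = df[['a', 'b']].sum(axis=1)

# df.sum() returns one total per column; assign it as a new last row.
df.loc[len(df)] = df.sum()

print(df)
```

Selecting the columns explicitly avoids the effect seen with column e above, where previously added columns were included in the row sum.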


Supplementary notes: pandas.DataFrame.apply

DataFrame.apply(self, func, axis=0, raw=False, result_type=None, args=(), **kwds)

func:

Function to apply to each column or row.

axis: {0 or 'index', 1 or 'columns'}, default 0

0 or 'index': apply the function to each column.
1 or 'columns': apply the function to each row.
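
As a small illustration of the two axis values, the sketch below reuses the DataFrame from the exercise and passes the string forms of axis. The variable names per_column and per_row are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# axis='index' (same as axis=0): the function receives each column in turn.
per_column = df.apply(sum, axis='index')

# axis='columns' (same as axis=1): the function receives each row in turn.
per_row = df.apply(sum, axis='columns')

print(per_column)
print(per_row)
```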

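raw: bool, default False

Determines whether each row or column is passed as a Series or as an ndarray object:
False: each row or column is passed to the function as a Series.
True: the function receives ndarray objects instead. If you are only applying a NumPy reduction function, this gives much better performance.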
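
To make the difference visible, the following sketch checks what type the function receives under each setting. The helper is_ndarray is a hypothetical name introduced only for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])

def is_ndarray(values):
    # Report whether apply handed over a NumPy array rather than a Series.
    return isinstance(values, np.ndarray)

# Default raw=False: each row arrives as a Series.
with_series = df.apply(is_ndarray, axis=1)

# raw=True: each row arrives as a plain ndarray.
with_ndarray = df.apply(is_ndarray, axis=1, raw=True)

# A NumPy reduction returns the same row totals with raw=True.
row_sums = df.apply(np.sum, axis=1, raw=True)

print(with_series.tolist(), with_ndarray.tolist(), row_sums.tolist())
```

Note that with raw=True the function cannot use label-based access such as row.a or row['a'], because an ndarray carries no column labels.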

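result_type: {'expand', 'reduce', 'broadcast', None}, default None
These only take effect when axis=1 (columns):
'expand': list-like results are turned into columns.
'reduce': return a Series if possible instead of expanding list-like results; this is the opposite of 'expand'.
'broadcast': results are broadcast to the original shape of the DataFrame; the original index and columns are retained.

The default behavior (None) depends on the return value of the applied function: list-like results are returned as a Series of those values. However, if the applied function returns a Series, its entries are expanded into columns. New in version 0.23.0.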
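
The 'broadcast' and 'reduce' options can be sketched as follows, using a function that returns the same two-element list for every row. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])

# 'broadcast': each list is fitted back into the original shape,
# so the result keeps the index and the column labels A and B.
broadcast = df.apply(lambda row: [1, 2], axis=1, result_type='broadcast')

# 'reduce': each list is kept as a single object in a Series.
reduced = df.apply(lambda row: [1, 2], axis=1, result_type='reduce')

print(broadcast)
print(reduced)
```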

args: tuple
Positional arguments to pass to func in addition to the array/Series.
**kwds

Additional keyword arguments to pass as keyword arguments to func.
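
A minimal sketch of how args and extra keyword arguments reach func is shown here. The function weighted_sum and its parameters weight_a and weight_b are hypothetical and exist only for this example.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

def weighted_sum(row, weight_a, weight_b=1):
    # row is one row of df as a Series (because axis=1 is used below).
    return row['a'] * weight_a + row['b'] * weight_b

# args supplies weight_a positionally; weight_b is forwarded via **kwds.
df['w'] = df.apply(weighted_sum, axis=1, args=(2,), weight_b=10)

print(df)
```

Here each row is computed as a * 2 + b * 10, so the new column holds 32 and 44.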

Returns: Series or DataFrame

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])
>>> df
   A  B
0  4  9
1  4  9
2  4  9
>>> df.apply(np.sqrt)
     A    B
0  2.0  3.0
1  2.0  3.0
2  2.0  3.0
>>> df.apply(np.sum, axis=0)
A    12
B    27
dtype: int64
>>> df.apply(np.sum)
A    12
B    27
dtype: int64
>>> df.apply(np.sum, axis=1)
0    13
1    13
2    13
dtype: int64
>>> df.apply(sum, axis=1)
0    13
1    13
2    13
dtype: int64
>>> df.apply(lambda x: [1, 2], axis=1)
0    [1, 2]
1    [1, 2]
2    [1, 2]
dtype: object
>>> df.apply(lambda x: [1, 2], axis=1, result_type='expand')
   0  1
0  1  2
1  1  2
2  1  2
>>> df.apply(lambda x: pd.Series([1, 2], index=['foo', 'bar']), axis=1)
   foo  bar
0    1    2
1    1    2
2    1    2

Last edited: 2021.04.25 17:09:52