The pandas.DataFrame.fillna() function
DataFrame.fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None, **kwargs)
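Purpose
- Fills NA/NaN values with the given value or by the given method.
Parameters
- value: scalar, dict, Series, or DataFrame. The value used to fill missing entries. A dict/Series/DataFrame supplies a separate fill value for each index label (for a Series) or each column (for a DataFrame); labels that do not appear in it are left unfilled.
- method: {'backfill', 'bfill', 'pad', 'ffill', None}, default None. pad/ffill carries the last valid value forward into the NaNs that follow it; backfill/bfill fills each NaN with the next valid value after it.
- axis: {0 or 'index', 1 or 'columns'}. The axis along which missing values are filled.
- inplace: bool, default False. If True, the object is filled in place instead of a new object being returned.
- limit: int, default None. If method is specified, this is the maximum number of consecutive NaN values to forward/backward fill.
- downcast: dict, default None. Each item in the dict is a rule for downcasting a data type.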
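The effect of inplace can be checked with a minimal sketch. The two-column frame and the variable names below are made up for illustration and are not part of the example that follows.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame with one NaN per column.
small = pd.DataFrame({"A": [1.0, np.nan], "B": [np.nan, 4.0]})

# inplace=False (the default): a new DataFrame is returned, `small` is unchanged.
filled = small.fillna(value=0)
remaining_in_small = int(small.isna().sum().sum())
remaining_in_filled = int(filled.isna().sum().sum())

# inplace=True: the object itself is modified, so work on a copy here.
target = small.copy()
target.fillna(value=0, inplace=True)
remaining_in_target = int(target.isna().sum().sum())

print(remaining_in_small, remaining_in_filled, remaining_in_target)
```

Reassigning the result (`small = small.fillna(0)`) is usually the easier form to read and chain.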
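Example
import pandas as pd
import numpy as np
# Build a 5x5 float array holding 0..24
a = np.arange(25, dtype=float).reshape((5, 5))
print(len(a))  # prints 5, the number of rows
# Blank out the lower-left triangle: row i gets NaN in its first i columns
for i in range(len(a)):
    a[i, :i] = np.nan
# Put one valid value back into column A so the fills have something to carry
a[3, 0] = 25.0
df = pd.DataFrame(data=a, columns=list('ABCDE'))
print(df)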
      A    B     C     D     E
0   0.0  1.0   2.0   3.0   4.0
1   NaN  6.0   7.0   8.0   9.0
2   NaN  NaN  12.0  13.0  14.0
3  25.0  NaN   NaN  18.0  19.0
4   NaN  NaN   NaN   NaN  24.0
1. Fill with 0
print(df.fillna(value=0))
      A    B     C     D     E
0   0.0  1.0   2.0   3.0   4.0
1   0.0  6.0   7.0   8.0   9.0
2   0.0  0.0  12.0  13.0  14.0
3  25.0  0.0   0.0  18.0  19.0
4   0.0  0.0   0.0   0.0  24.0
2. Forward fill (each NaN takes the last valid value above it)
print(df.fillna(method='pad'))
      A    B     C     D     E
0   0.0  1.0   2.0   3.0   4.0
1   0.0  6.0   7.0   8.0   9.0
2   0.0  6.0  12.0  13.0  14.0
3  25.0  6.0  12.0  18.0  19.0
4  25.0  6.0  12.0  18.0  24.0
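3. Backward fill (each NaN takes the next valid value below it; NaNs with no valid value below them stay NaN)
print(df.fillna(method='backfill'))
      A    B     C     D     E
0   0.0  1.0   2.0   3.0   4.0
1  25.0  6.0   7.0   8.0   9.0
2  25.0  NaN  12.0  13.0  14.0
3  25.0  NaN   NaN  18.0  19.0
4   NaN  NaN   NaN   NaN  24.0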
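As far as I know, recent pandas releases (2.1 and later) warn that the method argument of fillna is deprecated. The sketch below rebuilds the same frame and uses the DataFrame.ffill and DataFrame.bfill methods, which should give the same fills as method='pad' and method='backfill'. The row-wise call with axis=1 is an extra illustration that the examples above do not cover.

```python
import numpy as np
import pandas as pd

# Same frame as in the example above.
a = np.arange(25, dtype=float).reshape((5, 5))
for i in range(len(a)):
    a[i, :i] = np.nan
a[3, 0] = 25.0
df = pd.DataFrame(data=a, columns=list("ABCDE"))

forward = df.ffill()          # counterpart of fillna(method='pad')
backward = df.bfill()         # counterpart of fillna(method='backfill')
row_wise = df.bfill(axis=1)   # fill along each row, pulling values from the right

print(forward)
print(backward)
print(row_wise)
```

With axis=1 every NaN in this frame has a valid value somewhere to its right, so the row-wise backward fill leaves no NaN behind.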
4. Fill with a dict
# One fill value per column
values = {'A': 0, 'B': 1, 'C': 2, 'D': 3, 'E': 4}
print(df.fillna(value=values))
      A    B     C     D     E
0   0.0  1.0   2.0   3.0   4.0
1   0.0  6.0   7.0   8.0   9.0
2   0.0  1.0  12.0  13.0  14.0
3  25.0  1.0   2.0  18.0  19.0
4   0.0  1.0   2.0   3.0  24.0
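Since value also accepts a Series, a common variation is to fill each column with that column's own mean. The following is only a sketch of that idea on the same frame; the names col_means, mean_filled and partial are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Same frame as in the example above.
a = np.arange(25, dtype=float).reshape((5, 5))
for i in range(len(a)):
    a[i, :i] = np.nan
a[3, 0] = 25.0
df = pd.DataFrame(data=a, columns=list("ABCDE"))

# df.mean() skips NaN by default and returns a Series indexed by column name.
col_means = df.mean()
mean_filled = df.fillna(value=col_means)

# A dict that names only some columns leaves the other columns untouched.
partial = df.fillna(value={"A": 0})

print(col_means)
print(mean_filled)
print(partial)
```

Here the mean of column A is 12.5 (from 0.0 and 25.0), so every NaN in A becomes 12.5, while the partial dict fills A only and leaves the NaNs in B, C and D in place.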
5. Fill at most one consecutive NaN
print(df.fillna(method='pad', limit=1))
      A    B     C     D     E
0   0.0  1.0   2.0   3.0   4.0
1   0.0  6.0   7.0   8.0   9.0
2   NaN  6.0  12.0  13.0  14.0
3  25.0  NaN  12.0  18.0  19.0
4  25.0  NaN   NaN  18.0  24.0
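To see how limit changes the result, one can count the NaNs that remain in each column after the fill. This sketch uses DataFrame.ffill, which accepts the same limit argument as the fillna call above; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same frame as in the example above.
a = np.arange(25, dtype=float).reshape((5, 5))
for i in range(len(a)):
    a[i, :i] = np.nan
a[3, 0] = 25.0
df = pd.DataFrame(data=a, columns=list("ABCDE"))

# Count the NaNs left in each column for two different limits.
left_after_1 = df.ffill(limit=1).isna().sum()
left_after_2 = df.ffill(limit=2).isna().sum()

print(left_after_1)
print(left_after_2)
```

Column B has a run of three NaNs after its last valid value (6.0), so limit=1 leaves two of them unfilled and limit=2 leaves one.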