(2026.04.04 Sat @Chongzuo)
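Sliding-window computation: pandas.rolling
The rolling method (Series.rolling / DataFrame.rolling) computes statistics over a sliding window of a Series or DataFrame. Available aggregation and transformation functions include min, max, sum, and mean. It is especially useful for time-series analysis and for smoothing noisy data.
Example: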
import pandas as pd
import numpy as np
df = pd.DataFrame({'value': [10, 20, 30, np.nan, 50, 60]})
# Rolling mean with a window size of 3; min_periods=1 allows partial windows, and NaN values are skipped
rolling_mean = df['value'].rolling(window=3, min_periods=1).mean()
print(rolling_mean)
0    10.000000
1    15.000000
2    20.000000
3    25.000000
4    40.000000
5    55.000000
Name: value, dtype: float64
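Configurable parameters:
- window: size of the window, either an integer number of observations or a time offset (e.g., '2min') for a datetime-like index
- min_periods: minimum number of observations in the window required to produce a value; for an integer window it defaults to the window size
- center: if True, align labels at the center of the window (by default, labels sit at the right edge)
- win_type: apply a weighted window, e.g., triang, gaussian (requires SciPy); by default all points are weighted equally
- on: for a DataFrame, the datetime-like column to compute the rolling window on, instead of the index
- step (pandas 1.5+): evaluate the window at every nth step only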
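When step is not set, the result has the same length as the original Series; positions where the window holds fewer than min_periods valid observations are filled with NaN.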
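To make the effect of min_periods, center, and on more concrete, here is a minimal sketch. The column names ts and value and the sample numbers are made up for illustration, and the time-based window assumes the default right-closed interval.

```python
import numpy as np
import pandas as pd

s = pd.Series([10, 20, 30, np.nan, 50, 60], name="value")

# Default min_periods equals the integer window size (3), so every window
# holding fewer than 3 non-NaN observations produces NaN.
strict = s.rolling(window=3).mean()

# center=True labels each window by its middle element instead of its right edge.
centered = s.rolling(window=3, center=True, min_periods=1).mean()

# Time-based window: "2min" covers the current row and the row one minute
# earlier, measured on the hypothetical "ts" column rather than on the index.
df = pd.DataFrame({
    "ts": pd.date_range("2023-01-01", periods=5, freq="min"),
    "value": [1.0, 2.0, 3.0, 4.0, 5.0],
})
by_time = df.rolling("2min", on="ts").sum()["value"]

print(strict)
print(centered)
print(by_time)
```

With the default min_periods, only index 2 has a complete window of three valid values, so all other positions in strict are NaN. That is why the first example above passes min_periods=1.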
pandas.resample
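The resample method (Series.resample / DataFrame.resample) resamples a time series: it converts the series from one frequency to another, and it can aggregate or compute statistics over the resulting time bins.
Example: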
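import numpy as np
import pandas as pd
# Create one year of daily random data
dates = pd.date_range(start='2023-01-01', end='2023-12-31', freq='D')
data = np.random.rand(len(dates))
df = pd.DataFrame(data, index=dates, columns=['Random Data'])
# Downsample to monthly means ('M' = month end; pandas 2.2+ deprecates 'M' in favor of 'ME')
monthly_resampled_data = df.resample('M').mean()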
print(monthly_resampled_data.head())
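The example above downsamples (daily to monthly). Converting to a higher frequency works the same way, except that the new time slots have no data and must be filled explicitly. A minimal sketch with made-up values:

```python
import pandas as pd

# Three observations, two minutes apart (illustrative data).
s = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.date_range("2023-01-01", periods=3, freq="2min"),
)

# Upsample to 1-minute frequency; asfreq() leaves the new slots as NaN.
up = s.resample("1min").asfreq()

# ffill() carries the last known value forward into the new slots.
filled = s.resample("1min").ffill()

print(up)
print(filled)
```

Whether to forward-fill, interpolate, or leave the gaps as NaN depends on what the data represents; forward-fill is used here only as one common choice.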
Parameters:
- rule: The offset string or object representing target conversion.
- axis: Which axis to use for up- or down-sampling (deprecated since pandas 2.0).
- closed: Which side of bin interval is closed.
- label: Which bin edge label to label bucket with.
- convention: For PeriodIndex only, controls whether to use the start or end of rule.
- on: For a DataFrame, column to use instead of index for resampling. Column must be datetime-like.
- level: For a MultiIndex, level (name or number) to use for resampling. Level must be datetime-like.
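The closed, label, and on parameters are easiest to understand on a small series. The sketch below uses made-up minute-level data and a hypothetical column name when; it is an illustration, not a complete tour of the options.

```python
import pandas as pd

idx = pd.date_range("2023-01-01 00:00", periods=6, freq="min")
s = pd.Series([1, 2, 3, 4, 5, 6], index=idx)

# Default for minute frequencies: bins are closed and labeled on the left edge,
# so the bins are [00:00, 00:03) and [00:03, 00:06).
left = s.resample("3min").sum()

# closed="right" makes each bin include its right edge, (00:00, 00:03] etc.,
# and label="right" names each bin after that edge.
right = s.resample("3min", closed="right", label="right").sum()

# on= resamples by a datetime-like column instead of the index.
df = pd.DataFrame({"when": idx, "value": [1, 2, 3, 4, 5, 6]})
by_col = df.resample("3min", on="when").sum()["value"]

print(left)
print(right)
print(by_col)
```

With right-closed bins, the first timestamp 00:00 falls alone into the bin ending at 00:00, so the result has three bins instead of two.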